GCC Cross-Compiler

From OSDev.wiki
Revision as of 14:37, 30 November 2006 by Combuster (talk | contribs) (Import to mediawiki)

Introduction

What is a cross-compiler?

Generally speaking, a cross-compiler is a compiler that runs on platform A (the "host"), but generates executables for platform B (the "target"). The two "platforms" might differ in CPU, operating system, and/or executable format.

What is a "Canadian Cross"?

The black magic of cross-compiling: On platform A, compile a cross-compiler to run on platform B that generates executables for platform C.
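In GNU configure terms, platforms A, B and C correspond to the --build, --host and --target options. The following is only a minimal sketch that prints such a configure line without running it; the function name and the three triplets are made up for illustration.

```shell
#!/bin/sh
# Print (not run) a configure line for a Canadian Cross.
# build  = platform A, where the compiler is being compiled
# host   = platform B, where the resulting compiler will run
# target = platform C, for which that compiler generates code
canadian_cross_configure() {
    build=$1 host=$2 target=$3
    printf '../gcc-x.x.x/configure --build=%s --host=%s --target=%s\n' \
        "$build" "$host" "$target"
}

# Hypothetical triplets, chosen only for illustration.
canadian_cross_configure i686-pc-linux-gnu i686-pc-cygwin i586-elf
```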

Why an OS developer should build a cross-compiler

Doing Step 1 below to get a dedicated (cross-) compiler for your OS development work can save you many headaches. When your system compiler drags in references to alloca() or other OS-dependent things, your compiler and your assembler can't agree on binary formats, or your bootloader stubbornly insists that it cannot read your kernel binary, the easiest solution could be setting up a dedicated Step 1 cross-compiler. If anything, it places you on the same playground as other users: You can rest assured that any problems you might yet encounter are not specific to your compiler setup.

Don't fear, it's easier than you might think.

How this document is organized

We describe a sequence of steps, starting with nothing but your system compiler and ending with a native compiler for the target system. You might not need all those steps (when you want to compile your hobbyist OS into a binary, Step 1 is all you need).

Requirements

We assume you have a host system with a working GCC installation. If you are not using a bash shell, you might have to modify some of the command lines below. If you have just installed the basic Cygwin package, you have to run setup.exe again and install the GCC and make packages. Flex and bison are also necessary for older versions of GCC (< 3.4.x).

Note: Cygwin includes your Windows path in its bash path. This means that if you were using DJGPP before switching to Cygwin (in most cases, that is why you would want to build a GCC Cross-Compiler), you must either uninstall DJGPP first, or at least remove it from your PATH environment variable so the Cygwin tools get called instead of DJGPP. (After uninstalling DJGPP, you should delete the DJGPP environment variable and clear the C:\djgpp (or wherever you installed it) entry in your PATH to make sure everything's going to be all right.)
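To see whether DJGPP is still reachable from the Cygwin shell, a check along the following lines might help. It is a rough sketch: the function name is made up, and it simply looks for "djgpp" (in any letter case) in each PATH entry.

```shell
#!/bin/sh
# List every PATH entry that mentions DJGPP (case-insensitive).
# Takes the PATH string as its argument so it can be tried on any value.
find_djgpp_entries() {
    old_ifs=$IFS
    IFS=:
    for entry in $1; do
        case $entry in
            *[Dd][Jj][Gg][Pp][Pp]*) printf '%s\n' "$entry" ;;
        esac
    done
    IFS=$old_ifs
}

find_djgpp_entries "$PATH"

# The DJGPP environment variable should be unset as well.
if [ -n "${DJGPP:-}" ]; then
    echo "DJGPP is still set to: $DJGPP"
fi
```

If this prints nothing, neither the PATH nor the DJGPP variable refers to a DJGPP installation any more.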

Tested

This has been tested to work with the below combinations of binutils and gcc:

   Binutils \ GCC   3.4.3  3.4.4  3.4.5  4.0.0  4.0.1  4.0.2  4.0.3  4.1.1
   2.16             Yes    Yes    Yes    Yes    Yes    Yes    Yes    ?
   2.16.1           Yes    Yes    Yes    Yes    Yes    Yes    ?      Yes
   2.17             ?      ?      ?      ?      ?      ?      ?      Yes

It has also been tested successfully with various combinations of binutils 2.14 / 2.15 and GCC 3.2 / 3.3. The numbers refer to the versions being built, not the host compiler doing the build.

Problems have been reported when building binutils 2.14 with Cygwin GCC < 3.3.3.3 as the host compiler, and when building binutils <= 2.15 with GCC 4.x as the host compiler.

Step 1 - Bootstrap

We build a toolset running on your host that can turn source code into object files for your target system.

We need the binutils and the GCC packages from http://ftp.gnu.org/gnu/. Download them to /usr/src (or wherever you think appropriate), and unpack them.

You don't have to download the whole big gcc-x.x.x package - gcc-core is sufficient to build the C compiler. If you also want C++, download both gcc-core and gcc-g++, and unpack them in the same directory. (The remaining 12 megabytes or so in the main package are for Fortran, Ada, Java, Objective-C, and the test suite.)

Note: In the versioning scheme used, each full stop separates a whole number, e.g. 2.8.0, 2.9.0, 2.10.0, 2.11.0. This may be confusing if you are used to the schemes of Windows programs, or to plain decimal notation (e.g. 1.01, 1.02).
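The field-by-field comparison can be sketched in shell as follows. The helper name version_lt is made up for this example, and it assumes purely numeric fields (no suffixes such as "rc1").

```shell
#!/bin/sh
# Succeeds (exit status 0) if version $1 is older than version $2.
# Each dot-separated field is compared as a whole number, so 2.9.0 < 2.10.0.
# Assumes purely numeric fields such as 2.16.1.
version_lt() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}; fb=${b%%.*}          # leading field of each version
        [ "${fa:-0}" -lt "${fb:-0}" ] && return 0
        [ "${fa:-0}" -gt "${fb:-0}" ] && return 1
        # Drop the leading field; an exhausted version becomes empty.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 1    # versions are equal
}

version_lt 2.9.0 2.10.0 && echo "2.9.0 is older than 2.10.0"
```

A missing field counts as 0, so 2.16 is treated as older than 2.16.1.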

Preparation

   export PREFIX=/usr/cross
   export TARGET=i586-elf
   cd /usr/src
   mkdir build-binutils build-gcc

The prefix will configure the build process so that all the files of your cross-compiler environment end up in /usr/cross, without disturbing your "normal" compiler setup.

binutils

   cd /usr/src/build-binutils
   ../binutils-x.xx/configure --target=$TARGET --prefix=$PREFIX --disable-nls
   make all install

This compiles the binutils (assembler, disassembler, and various other useful stuff), runnable on your system but handling code in the format specified by $TARGET.

--disable-nls tells binutils not to include native language support. This is basically optional, but reduces dependencies and compile time. It will also result in English-language diagnostics, which the people on the Forum understand when you ask your questions. ;-)

GCC

Now, you can bootstrap GCC. (Use v3.3 or later - GCC 3.2.x has a bug with internal __malloc declarations resulting in an error during compilation. This could be fixed by patching four occurrences in three different source files, but I lost the diff output and am not in the mood to re-check. ;-) )

   cd /usr/src/build-gcc
   export PATH=$PATH:$PREFIX/bin
   ../gcc-x.x.x/configure --target=$TARGET --prefix=$PREFIX --disable-nls \
       --enable-languages=c,c++ --without-headers --with-newlib
   make all-gcc install-gcc

Explanation of Options

The path has to be extended since GCC needs the binutils we built earlier at some point of the build process. You might want to add these extensions to your $PATH permanently, so you won't have to use fully qualified path names every time you call your cross-compiler.
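For a permanent setup, something like the following could go into a shell startup file such as ~/.bashrc. It is a minimal sketch; path_append is a helper made up for this example, and it only appends the directory when it is not already listed, so sourcing the file repeatedly does not grow PATH.

```shell
#!/bin/sh
# Append a directory to PATH unless it is already listed.
# Safe to source repeatedly (for example from ~/.bashrc or ~/.profile).
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

PREFIX=${PREFIX:-/usr/cross}
path_append "$PREFIX/bin"
export PATH
```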

--disable-nls is the same as for binutils above.

--without-headers tells GCC not to rely on any C library (standard or runtime) being present for the target.

--with-newlib is only necessary if you are compiling GCC <= 3.3.x. That version has a known bug that keeps --without-headers from working correctly. Additionally setting --with-newlib is a workaround for that bug.

--enable-languages tells GCC not to compile all the other language frontends it supports, but only C (and optionally C++).
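The version-dependent choice of options can be written down as a small helper. This is only a sketch: gcc_configure_flags is a made-up name, it prints the options rather than running configure, and it assumes a plain numeric version string.

```shell
#!/bin/sh
# Print the configure options for a bare cross-GCC of the given version.
# --with-newlib is added only for GCC 3.3.x and older, as described above.
# Assumes a plain numeric version string such as 3.3.6 or 3.4.3.
gcc_configure_flags() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    flags='--disable-nls --enable-languages=c,c++ --without-headers'
    if [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -le 3 ]; }; then
        flags="$flags --with-newlib"
    fi
    printf '%s\n' "$flags"
}

gcc_configure_flags 3.3.6
gcc_configure_flags 3.4.3
```

The first call prints the option list including --with-newlib, the second prints it without.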

Summary

Now you have a "naked" cross-compiler. It does not have access to a C library or C runtime yet, so you cannot use any of the standard includes or create runnable binaries. But it is quite sufficient to compile your self-made kernel.

Usage

Once you are finished, your toolset resides in /usr/cross. For example, you have a gcc executable in /usr/cross/bin/$TARGET-gcc (and /usr/cross/$TARGET/gcc as well), which spits out binaries for your TARGET. Add /usr/cross/bin to your PATH environment variable, so that gcc invokes your system compiler, and $TARGET-gcc invokes your cross-compiler.

You could also use the -b and -V options of GCC. Let's assume you have a gcc 3.3.3 as system compiler (/usr/bin/gcc), and you just created a gcc 3.4.3 cross-compiler for i586-elf (/usr/cross/bin/i586-elf-gcc). Instead of calling i586-elf-gcc, you could also call gcc -b i586-elf -V 3.4.3...

This sounds strange at first, but it is rather simple: These options, passed to your system compiler, tell it that it's not really the system compiler you want - but rather the one for the target passed by the -b $TARGET option, and in the version passed by the -V $VERSION option. The system compiler turns these two options into a new executable name (namely $TARGET-gcc-$VERSION), and calls that one instead of compiling the source itself. Neat, huh? ;-)

Troubleshooting

i586-elf-ar not found

You forgot to set the executable path ($PATH) to include $PREFIX/bin.

Error: junk at end of line, first unrecognized character is ','

This, in combination with lots of other assembly-level error messages (like, Warning: .type pseudo-op used outside of .def/.endef ignored, or Error: unknown pseudo-op: '.local') results when you did not correctly set the --prefix=$PREFIX during the binutils configure.

Another possibility is that you did configure, compile and install your cross-compiler correctly, but don't actually use it. Check the "Usage" section above.
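A quick way to tell whether the cross tools are actually reachable through your PATH is a check like the one below. It is a rough sketch; check_cross_tools is a made-up helper, and the list of tools it looks for (gcc, as, ld, ar) is just a reasonable selection.

```shell
#!/bin/sh
# Report which cross tools can be found through PATH.
# Returns non-zero if at least one of them is missing.
check_cross_tools() {
    target=$1
    missing=0
    for tool in gcc as ld ar; do
        if tool_path=$(command -v "$target-$tool"); then
            echo "found:   $tool_path"
        else
            echo "missing: $target-$tool (is \$PREFIX/bin in PATH?)"
            missing=1
        fi
    done
    return "$missing"
}

check_cross_tools "${TARGET:-i586-elf}" || echo "cross toolchain incomplete"
```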

If you try compiling on 64-bit Windows, you will receive an "Unknown host machine type" error when running configure. To fix this, scroll up in your shell to just after the point where you entered the configure command; the output names a website from which you can download updated files for guessing the host type. Put them in the root directory of your source tree. With GCC version 3.4.0 you will also have to override the host environment, as it does not support being compiled on x86_64-unknown-cygwin: add the command line argument --host=i686-unknown-cygwin to the configure line for GCC. --CjMovie

Step 2 - C Library

In the second step, we will build the C library for your target system.

The cross-compiler from Step 1 will spit errors whenever you want to #include any of the standard headers (except for a select few that actually are platform-independent, and generated by the compiler itself). This is quite correct - you don't have a standard library for the target system yet!

The C standard defines two different kinds of executing environments - "freestanding" and "hosted". While the definition might be rather fuzzy for the average application programmer, it is pretty clear-cut when you're doing OS development: A kernel is "freestanding", everything you do in user space is "hosted".

A "freestanding" environment needs to provide only a subset of the C library: float.h, iso646.h, limits.h, stdarg.h, stdbool.h, stddef.h, and stdint.h (as of C99). All of these consist of typedefs and #defines "only", so you can implement them without a single .c file in sight.

You will probably want to add a "hosted" C library, so you can use all the nice #includes it provides. You have two choices - writing your own (not recommended, it's one big chunk of work all on its own), or porting an existing one.

You have to realize that a standard library needs to call the kernel in many places. While malloc() can pass out chunks of memory it manages internally, it has to call upon the kernel to actually get any memory it can manage. printf() sure can format your output, but it still has to call upon the kernel to actually print that output anywhere. fopen() does all the C-specific file management for you, but it has to call upon the kernel to actually provide access to a file. And so it goes on and on.

Well-known available C libraries are the GNU libc and newlib. Newlib is most likely the easier one to port. Another alternative would be building on PDPCLIB, or helping MartinBaute with the PDCLib (the latter one being aimed at maximum ease of portability for the very purpose of hobbyist OS development).

Once you have done that, go back to the above how-to, and recompile your GCC environment using the --with-headers option to tell it where to find the headers of your C library. You now have access to the standard library, but you still can't compile standalone executables, since you are still without a C (C++) runtime.

...to be extended.

Step 3 - Full Cross-Compiler

In the third step, we will build a "complete" cross-compiler, which can create not only object files, but standalone executables. For that, you will need a C (C++) runtime that sets up the process environment for an executable, i.e. all the stuff that happens before int main(). This is highly platform-specific, so we can only give an overview of the steps involved.

...to be extended.

Step 4 - Native Compiler

In the fourth step, we will "bootstrap" a native GCC, that not only compiles for the target, but also runs on the target. This is basically the ticket for leaving your host OS behind, and doing all your development work on your own OS! (Of course, you need an editor and a shell environment to actually enjoy this, but you sure did that first thing after finishing Step 3 above, didn't you...?)

...to be extended. This is the only place where --disable-shared really makes sense.

x86_64

Creating a cross-compiler to a different CPU is harder than just targeting a different binary format. To add insult to injury, x86_64 is a special case since it does not support a plain ELF target (yet?). Binutils works fine, but GCC causes problems when trying to build it as --target=x86_64-elf, so we have to trick our way around it.

Patches for the toolchain are available on Sourceware and from the GCC website (http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01286.html). The binutils patch (on Sourceware) is applied by default in 2.17, so if you use 2.17 you don't need that patch. The GCC patch isn't applied in any publicly released version (as of 4.1.1 and lower, that is), so you'll probably need that.

  • note - you are very strongly encouraged to use these patches. You might need to modify the sources instead (but then again, if you're making a compiler you should be able to).

Lots of thanks go to Mikkel Krautz for creating & submitting these patches.

Preparation

With patches:

   mkdir /usr/cross
   export PREFIX=/usr/cross
   export TARGET=x86_64-pc-elf
   cd /usr/src
   mkdir build-binutils build-gcc

Without patches:

   mkdir /usr/cross
   export PREFIX=/usr/cross
   export TARGET=x86_64-pc-linux
   cd /usr/src
   mkdir build-binutils build-gcc

This is the same as in the generic how-to above, save for the different TARGET.

Binutils

   cd /usr/src/build-binutils
   ../binutils-x.xx/configure --target=$TARGET --prefix=$PREFIX --enable-64-bit-bfd
   make all install

Compiling binutils is again nearly identical; note however the additional option --enable-64-bit-bfd.

The BFD library is what all binutils programs use to access files, and it grows considerably larger and slower if you add 64-bit support to it. Since very few systems use 64-bit, the binutils maintainers do not enable it by default, and for some reason it is not enabled automatically depending on the host either, so you have to request it manually. Note that the binutils do not actually do anything with the target itself; they work with binary files, not code. In the binary files, special codes specify what needs to be done, in such a way that the code doesn't have to be understood or, in most cases, even read.

GCC

Compiling GCC with patches is pretty much the same as in the generic how-to above. Without the patches, however, compiling requires passing -k to make and ignoring errors. The compiler stubbornly tries to compile some libraries that require a normal environment (such as Linux) that you don't have (or haven't specified).

Both with and without patches, you'll use all-gcc and install-gcc. If you don't, the installer tries to make a few libraries and fails. Once your platform provides a reasonable amount of userland support, you can try plain all and install instead, for a more complete GCC environment.

With patches:

   cd /usr/src/build-gcc
   export PATH=$PATH:$PREFIX/bin
   ../gcc-x.x.x/configure --target=$TARGET --prefix=$PREFIX \
       --disable-nls --enable-languages=c,c++ --with-newlib --without-headers
   make all-gcc install-gcc

Without patches:

   cd /usr/src/build-gcc
   export PATH=$PATH:$PREFIX/bin
   ../gcc-x.x.x/configure --target=$TARGET --prefix=$PREFIX \
       --disable-nls --enable-languages=c,c++ --with-newlib --without-headers
   make -k all-gcc install-gcc

Since we could not select a plain ELF target, GCC stubbornly insists on compiling some OS dependent libraries, which of course does not work because we don't have them in our OS. The use of the -k option to make forces the build process to continue nevertheless, ignoring the errors. So far, this seems to work.
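Because make -k exits with a non-zero status whenever any target failed, its exit status alone cannot tell a usable compiler from a broken build. One way to check the outcome is to test for the installed compiler driver directly. This is a minimal sketch; cross_gcc_installed is a made-up helper, and the default values merely mirror the ones used in this section.

```shell
#!/bin/sh
# After "make -k", check that the compiler driver was actually installed.
# make -k exits non-zero whenever any target failed, so its status alone
# cannot tell a usable compiler from a broken build.
cross_gcc_installed() {
    prefix=$1 target=$2
    [ -x "$prefix/bin/$target-gcc" ]
}

if cross_gcc_installed "${PREFIX:-/usr/cross}" "${TARGET:-x86_64-pc-linux}"; then
    echo "cross-compiler installed"
else
    echo "cross-compiler missing, inspect the make output"
fi
```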

Other Things

--disable-shared

This is an option to the GCC configure process that you might or might not use. It's a bit tricky...

with --disable-shared...

...every single executable in your cross-compiler environment will be statically linked, which makes them huge. On the other hand, you can move around those executables, as there are no dependencies (which is what you want in step 4 above, to "bootstrap" your first native compiler). However, if you compiled any of the libraries that are linked in with a more advanced -march, the resulting executable is going to be only for that arch or better. This is usually no problem - unless you're doing a Canadian Cross where you might want to put the executable on a slower (less-featured) machine. Bad luck...

without --disable-shared...

...the executables will use dynamic linking, and will be much smaller. Now comes the tricky part: Not only the system libraries are dynamically linked, but also those implementing generic functionality of the cross-compilation environment. This is done by hard-coding the library paths into the executables... meaning that you can not move them to another directory. This is usually not necessary - unless you're doing a Canadian Cross where you might want to build the executables in a directory different from the one you want to install them in (e.g. due to access limitations on the build machine). Bad luck...

For a Canadian Cross, the best idea is not to use the option, and to copy the file to the exact same directory.

Related Stuff

http://kegel.com/crosstool has a popular example of a script that automatically downloads, patches, and builds binutils, gcc, and glibc for known platforms.

http://crossgcc.billgatliff.com/ is Bill Gatliff's CrossGCC FAQ. <This link appears to be dead>

http://forums.gentoo.org/viewtopic.php?t=66125 - Compiling Windows applications under Linux

http://www.libsdl.org/extras/win32/cross/README.txt - ditto

Canadian Cross - making things yet more complicated.

Authors

The entire default procedure and the moderation of the page is done by MartinBaute, while DasCandy takes care of the Canadian Cross and x86-64 (amd64) sections.