Why do I need a Cross Compiler?

::''Notice:'' This page is specific to GCC. If you use another compiler, you should research how cross-compilation is normally done with that compiler and do it that way. GCC is quite tightly bound to its native target system; many other compilers are not. Some compilers don't even have a native target; they are always cross-compilers.

You need to use a [[GCC_Cross-Compiler|cross-compiler]] ''unless'' you are developing on your own operating system. The compiler ''must'' know the correct [[Target Triplet|target platform]] (CPU, operating system), otherwise you will run into trouble. You may be able to use the compiler that comes with your system if you pass a number of options to beat it into submission, but this will create a lot of completely unnecessary problems.
 
It is possible to ask your compiler what target platform it is currently using by calling the command:
 
<syntaxhighlight lang="bash">
gcc -dumpmachine
</syntaxhighlight>
 
If you are developing on 64-bit Linux, then you will get a response such as 'x86_64-unknown-linux-gnu'. This means that the compiler thinks it is creating code for Linux. If you use this GCC to build your kernel, it will use your system libraries, headers, the Linux [[libgcc]], and it will make a lot of problematic Linux assumptions. If you use a [[GCC_Cross-Compiler|cross-compiler]] such as i686-elf-gcc, then you get a response such as 'i686-elf', which means the compiler knows it is doing something else and you can avoid a lot of problems easily and properly.
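
This check can be automated in a build script. The following is only a minimal sketch: the function name <tt>is_hosted_target</tt> and its list of operating system patterns are assumptions made for illustration, and the list is far from exhaustive.

<syntaxhighlight lang="bash">
# Hypothetical helper: succeed if the triplet names a hosted operating system.
# The pattern list is an illustrative assumption, not a complete list.
is_hosted_target() {
    case "$1" in
        *-linux*|*-darwin*|*-mingw*|*-cygwin*|*-freebsd*) return 0 ;;
        *) return 1 ;;
    esac
}

# CC defaults to the system compiler; override it, e.g. CC=i686-elf-gcc.
CC="${CC:-gcc}"
if command -v "$CC" >/dev/null 2>&1; then
    target="$("$CC" -dumpmachine)"
    if is_hosted_target "$target"; then
        echo "warning: $CC targets $target, not a bare-metal target" >&2
    else
        echo "$CC targets $target"
    fi
fi
</syntaxhighlight>

A build script could refuse to continue when the warning branch is taken, so that the system compiler is never used for the kernel by accident.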
 
== How to build a Cross-Compiler ==
{{Main|GCC Cross Compiler}}
 
It is easy and takes a few moments to [[GCC_Cross-Compiler|build a cross-compiler]] that targets your operating system. It may take a while to build it on slower computers, but you only need to do it once, and you save all the time you would otherwise spend on "fixing" the completely unnecessary problems you would encounter. Later on, when you start building a user-space for your operating system, it is worth creating an [[OS_Specific_Toolchain|OS Specific Toolchain]] for absolute control of the compiler and to ease compiling user-space programs.
 
== Transitioning to a Cross-Compiler ==
Perhaps you have not been using a [[GCC_Cross-Compiler|cross-compiler]] until now, in which case you are likely doing a lot of things wrong. Unfortunately, a lot of kernel tutorials suggest passing certain options and doing things in a manner that potentially causes a lot of trouble. This section documents some of the things you should watch out for. Please read this section carefully and point others to it if you see them using troublesome options.
 
=== Linking with your compiler rather than ld ===
You shouldn't be invoking ld directly. Your cross-compiler is able to work as a linker, and using it as the linker allows it control at the linking stage. This control includes expanding <tt>-lgcc</tt> to the full path of [[libgcc]] that only the compiler knows about. If you get weird errors during compilation, use your cross-compiler for linking and they may go away. If you do need ld, be sure to use the cross-linker (i686-elf-ld) rather than the system linker.
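
To see what the compiler driver knows that a bare ld invocation does not, you can ask it where its [[libgcc]] lives. <tt>-print-libgcc-file-name</tt> is a standard GCC option. The helper name <tt>libgcc_path</tt> and the default compiler name are assumptions made for this sketch.

<syntaxhighlight lang="bash">
# Hypothetical helper: print the libgcc path known to the given compiler driver.
# Fails with a message if that compiler is not installed.
libgcc_path() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "$1 not found; build the cross-compiler first" >&2
        return 1
    }
    # Full path of the libgcc.a that -lgcc resolves to for this target.
    "$1" -print-libgcc-file-name
}

# Assumes the cross-compiler is called i686-elf-gcc unless CC says otherwise.
libgcc_path "${CC:-i686-elf-gcc}" || true
</syntaxhighlight>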
 
=== Using cross-tools ===
You get a lot of useful programs when you build your cross-binutils. For instance, you get i686-elf-readelf, i686-elf-as, i686-elf-objdump, i686-elf-objcopy, and more. These programs know about your operating system and handle everything correctly. You can use some of the programs that come with your local operating system instead (readelf, objcopy, objdump) if they know about the file format of your operating system, but it is in general best to use your cross tools instead. These tools all consistently have the prefix 'i686-elf-' if the platform of your OS is i686-elf.
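
Because the prefix is consistent, a build script can derive every tool name from the target. This is a minimal sketch: the variable <tt>TARGET</tt> and the helper <tt>cross_tool</tt> are names invented for illustration.

<syntaxhighlight lang="bash">
# Assumed target platform; change it to match your cross-binutils.
TARGET="${TARGET:-i686-elf}"

# Hypothetical helper: map a generic tool name to its cross-tool name.
cross_tool() {
    echo "${TARGET}-$1"
}

# With the default TARGET these are i686-elf-objdump and i686-elf-readelf.
OBJDUMP="$(cross_tool objdump)"
READELF="$(cross_tool readelf)"
echo "$OBJDUMP $READELF"
</syntaxhighlight>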
 
=== Options that you should pass to your Compiler ===
You need to pass some special options to your compiler to tell it it isn't building user-space programs.
 
==== -ffreestanding ====
This is important as it lets the compiler know it is building a kernel rather than a user-space program. The documentation for GCC says you are required to implement the functions memset, memcpy, memcmp and memmove yourself in freestanding mode.
 
==== -mno-red-zone (x86_64 only) ====
You need to pass this on x86_64 or interrupts will corrupt the stack. The red zone is an x86_64 ABI feature that means that signals happen 128 bytes further down the stack. Functions that use less than that amount of memory are allowed to not adjust the stack pointer. This means that CPU interrupts in the kernel will corrupt the stack. Be sure to pass this option for all x86_64 kernel code.
 
==== -fno-exceptions, -fno-rtti (C++) ====
It is wise to disable C++ features that don't work out-of-the-box in kernels. You need to supply a C++ support library to the kernel (in addition to libgcc) to make all C++ features work. If you don't use these C++ features, it should be sufficient to pass these options.
 
=== Options you should link with ===
These options only make sense when linking (not when compiling) and you should use them. You should pass the compilation options as well when linking, as some compilation options (such as <tt>-mno-red-zone</tt>) control the ABI and this needs to be known at link time as well.
 
==== -nostdlib (same as both -nostartfiles -nodefaultlibs) ====
The -nostdlib option is the same as passing both the -nostartfiles and -nodefaultlibs options. You don't want the start files (crt0.o, crti.o, crtn.o) in the kernel as they are only used for user-space programs. You don't want the default libraries such as libc, because the user-space versions are not suitable for kernel use. You should only pass -nostdlib, as it is the same as passing the two latter options, which you can then remove.
 
==== -lgcc ====
You disable the important [[libgcc]] library when you pass -nodefaultlibs (implied by -nostdlib). The compiler needs this library for many operations that it cannot do itself or that are more efficient to put into a shared function. You must pass this library at the end of the link line, after all the other object files and libraries, or the linker won't use it and you get strange linker errors. This is due to the classic static linking model where an object file from a static library is only pulled in if it is used by a previous object file. Linking with [[libgcc]] must come after all the object files that might use it.
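
The ordering rule for <tt>-lgcc</tt> can be kept in one place by assembling the link command in a script. This is a sketch only: the variable names are assumptions, the file names are the ones used in the example commands on this page, and the command is printed rather than executed.

<syntaxhighlight lang="bash">
# Assumes the cross-compiler is called i686-elf-gcc unless CC says otherwise.
CC="${CC:-i686-elf-gcc}"
OBJS="boot.o kernel.o"

# -lgcc goes last so that it can satisfy references from every object before it.
LINK_CMD="$CC -T link.ld -o kernel.bin -ffreestanding -nostdlib $OBJS -lgcc"
echo "$LINK_CMD"
</syntaxhighlight>

Appending new object files to <tt>OBJS</tt> keeps them ahead of <tt>-lgcc</tt> without anyone having to remember the rule.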
 
=== Options that you shouldn't pass to your Compiler ===
There are a number of options you normally shouldn't pass to your cross-compiler when building a kernel. Unfortunately, a lot of kernel tutorials suggest you use these. Please do not pass an option without understanding why it is needed, and don't suggest to people that they use them. Often, these options are used by those who don't use cross-compilers to cover up other problems.
 
==== -m32, -m64 (compiler) ====
If you build a cross-compiler such as i686-elf-gcc, then you don't need to tell it to make a 32-bit executable. Likewise, you don't need to pass -m64 to x86_64-elf-gcc. This will make your Makefiles much simpler as you can simply select the correct compiler and things will work. You can use x86_64-elf-gcc to build a 32-bit kernel, but it's much easier to just build two cross-compilers and use them. In addition, using a cross-compiler for every CPU you target will make it easy to port third-party software without tricking them into passing -m32 as well.
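
Selecting the compiler instead of passing -m32 or -m64 might look like the following in a build script. This is a minimal sketch: the helper <tt>compiler_for</tt> and the variable <tt>ARCH</tt> are hypothetical names, and only the two targets named above are handled.

<syntaxhighlight lang="bash">
# Hypothetical helper: pick the cross-compiler for a kernel architecture.
compiler_for() {
    case "$1" in
        i686|x86_64) echo "$1-elf-gcc" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Assumed default architecture; no -m32 or -m64 is needed afterwards.
ARCH="${ARCH:-i686}"
CC="$(compiler_for "$ARCH")" && echo "CC=$CC"
</syntaxhighlight>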
 
==== -melf_i386, -melf_x86_64 (linker) ====
You don't need to pass these for the same reason as -m32 and -m64. Additionally, these options are for ld, and you shouldn't be invoking ld directly in the first place, but rather linking with your cross-compiler.
 
==== -32, -64 (assembler) ====
The cross-assembler (i686-elf-as) defaults to the platform you specified when building binutils, and so you don't need to repeat the choice here. You can use the cross-compiler as an assembler, but it is okay to call the assembler directly.
 
==== -nostdinc ====
You shouldn't pass this option as it disables the standard header include directories. However, you do want to use these headers as they contain many useful declarations. The cross-compiler comes with a bunch of useful headers such as stddef.h, stdint.h, stdarg.h, and more.
 
If you don't use a cross-compiler, you get the headers for your host platform (such as Linux) which are unsuitable for your operating system. For that reason, most people that don't use a cross-compiler use this option and then have to reimplement stddef.h, stdint.h, stdarg.h and more themselves. People often implement those files incorrectly as you need compiler magic to implement features such as stdarg.h.
 
==== -fno-builtin ====
This option is implied by <tt>-ffreestanding</tt> and there is no reason to pass it yourself. The compiler defaults to <tt>-fbuiltin</tt>, which enables builtins, while <tt>-fno-builtin</tt> disables them. Builtins mean that the compiler knows about standard features and can optimize their use. If the compiler sees a function called 'strlen', it normally assumes it is the C standard 'strlen' function and it is able to optimize the expression strlen("foo") into 3 at compile time, instead of calling the function. This option has value if you are creating some really non-standard environment in which common C functions don't have their usual semantics. It is possible to enable builtins again with <tt>-fbuiltin</tt> following <tt>-ffreestanding</tt>, but this can lead to surprising problems down the road, such as the implementation of calloc (malloc + memset) being optimized into a call to calloc itself.
 
==== -fno-stack-protector ====
The [[Stack Smashing Protector]] is a feature that stores a random value on the stack of selected functions and verifies the value is intact upon return. This statistically prevents stack buffer overflows overwriting the return pointer on the stack, which would subvert control flow. Adversaries are often able to exploit such faults, and this feature requires the adversary to correctly guess a 32-bit value (32-bit systems) or a 64-bit value (64-bit systems). This security feature requires [[Stack Smashing Protector|runtime support]]. Compilers from many operating system vendors enable this feature by having <tt>-fstack-protector</tt> be the default. This breaks kernels that don't use a cross-compiler, if they don't have the runtime support. Cross-compilers such as the <tt>*-elf</tt> targets have the stack protector disabled by default and there's no reason to disable it yourself. You may want to change the default to enabling it when you add support for it to your kernel (and user-space), which would make it automatically used by your kernel because you didn't pass this option.
 
== Problems that occur without a Cross-Compiler ==
You need to overcome a lot of imaginary problems to use your system gcc to build your kernel. You don't need to deal with these problems if you use a cross-compiler.
 
=== More complicated compilation commands ===
The compiler assumes it is targeting your local system, so you need a lot of options to make it behave. A trimmed down command sequence for compiling a kernel without a cross-compiler could look like this:
 
<syntaxhighlight lang="bash">
as -32 boot.s -o boot.o
gcc -m32 -c kernel.c -o kernel.o -ffreestanding -nostdinc
gcc -m32 -c my-libgcc-reimplementation.c -o my-libgcc-reimplementation.o -ffreestanding
gcc -m32 -T link.ld boot.o kernel.o my-libgcc-reimplementation.o -o kernel.bin -nostdlib -ffreestanding
</syntaxhighlight>
 
Actually, the average case is worse. People tend to add many more problematic or redundant options. With a real cross-compiler, the command sequence could look like this:
 
<syntaxhighlight lang="bash">
i686-elf-as boot.s -o boot.o
i686-elf-gcc -c kernel.c -o kernel.o -ffreestanding
i686-elf-gcc -T link.ld boot.o kernel.o -o kernel.bin -nostdlib -ffreestanding -lgcc
</syntaxhighlight>
 
=== Reimplementing libgcc ===
 
=== Complicated compiling user-space programs ===
You need to pass even more options to the command lines that build programs for your operating system. You need a -Ipath/to/myos/include and -Lpath/to/myos/lib to use the C library, and more. If you set up an [[OS Specific Toolchain]], you just need
 
<syntaxhighlight lang="bash">
i686-myos-gcc hello.c -o hello
</syntaxhighlight>
 
to cross-compile the hello world program to your operating system.
 
=== And so on ===
As the project grows in size, it becomes much more complicated to maintain your operating system without a real cross-compiler. Even if your ABI is very much like Linux, your operating system isn't Linux. Porting third party software is near impossible without a cross-compiler. If you set up a real [[OS Specific Toolchain]] and a [[sysroot]] of your OS, you can compile software just by giving --host=i686-myos to ./configure. With a cross-compiler you can [[Cross-Porting Software|port software]] in the standard manner.
 
== Background information ==
 
=== Where did the idea of cross compiling come from?===
=== What are the basics of cross compiling? ===
 
The "build" machine is the machine you're compiling the software on. The software being compiled may be meant to run on some other type of machine. For example, you may be building on an x86-based machine and want the software to run on a SPARC-based machine. The build machine is implicit and will usually be auto-detected by the configure script for the software. Its only real purpose is so that, if the software being compiled chooses to keep the configure arguments used to configure it somewhere in the built package, the people to whom the package is distributed will know what machine the package was built on. The name of the build machine may also be used to configure the package to use workarounds if the build machine is universally known to have certain problems building that software.
 
The "host" machine is the machine on which the software must run. So in the previous example, the "build" machine is an i686-elf-yourBuildOs machine, and the host is a sparc32-elf-unix4 machine.
The person who built the compiler that runs on my machine built it on a machine just like mine, an 'i486-linux-gnu' (build), and intended for it to run on a machine like his/mine: the same i486-linux-gnu (host), and he meant for this compiler he was building for me, to emit executables that targeted my very machine, so that when I compile programs, they will be able to run on a host that is i486-linux-gnu. Therefore he made the compiler target i486-linux-gnu.
 
== See Also ==
 
=== Articles ===
* [[GCC Cross-Compiler]]
* [[Cross-Porting Software]]
* [["zig cc" Cross-Compiler]]
 
[[Category:Compilers]]
[[Category:FAQ]]
[[Category:OS_Development]]