OS Specific Toolchain: Difference between revisions

{{Rating|3}}
 
This tutorial will guide you through creating a toolchain comprising Binutils and GCC that specifically targets your operating system. The instructions below teach Binutils and GCC how to create programs for a hypothetical OS named 'MyOS'.
This tutorial will walk you through creating a toolchain comprising Binutils, GCC (with cc1 and [[libgcc]]), newlib and optionally cc1plus and libstdc++ (with libsupc++) that specifically targets your OS. In this tutorial we will assume you are using an i586 compatible PC for a hypothetical OS named 'MyOS' and so the GNU canonical name we will be targeting is 'i586-pc-myos'. It should not, however, be difficult to adapt this tutorial for other systems, e.g. x86-64. The resulting setup differs little in functionality from those produced by the [[GCC Cross-Compiler]] and [[Porting Newlib]] tutorials (aside from the name). It is merely intended to demonstrate the principles behind adding a new target to these packages and provide you with a framework for incorporating your own OS-specific modifications. You will, however, get a newlib that does not require libgloss (linking will work 'out of the box').
 
Until now you have been using a [[GCC Cross-Compiler|cross-compiler]] configured to use an existing generic bare target. This is very convenient when starting out as you get a reliable target and the compiler doesn't make any bad assumptions because it thinks it is targeting an existing operating system. However, when you proceed it becomes useful if the compiler knows it is targeting your operating system and what its customs are. For instance, you can make the compiler define a <tt>__myos__</tt> preprocessor macro, know which directories to search for include files in, what special <tt>crt*.o</tt> files are used when linking against libc, and so on. It also becomes much easier to cross-compile software to your OS when you simply have to invoke <tt>x86_64-myos-gcc hello.c -o hello</tt> to cross-compile a program. Additionally part of the instructions here can be applied to other software packages that also use the GNU build system, which will help you [[Cross-Porting Software|port existing software]].
= Prerequisites =
* The latest binutils, gcc-core and newlib packages, and optionally gcc-g++. This tutorial was developed with binutils-2.18, gcc-4.2.1 and newlib-1.15.0, and has recently been tested on binutils-2.22 and gcc-4.7.0, and again on binutils-2.23 and gcc-4.7.2 using newlib-2.0.0.
* A build environment that can successfully build a [[GCC Cross-Compiler]]. This tutorial was developed with Cygwin and Ubuntu 12.04.
* autoconf, automake, autoreconf, libtool.
* The OS you are targeting should be able to handle syscalls, with a well defined interface.
* Knowledge of the internals of binutils, gcc and newlib, along with a working knowledge of autoconf and automake.
* Depending on your version of gcc, you may also need gmp, mpfr and mpc.
 
This tutorial teaches you how to set up a cross-compiler that specifically targets your OS. This is actually the first step of ''porting Binutils and GCC to your operating system'': Any information you give GCC about your OS will help it run on your OS. Once your OS Specific Toolchain has been set up and you have built your OS with it, you can continue by using the cross-compiler to cross-compile the compiler itself to your OS, assuming your libc and kernel are powerful enough. For more information see [[Porting GCC to your OS]].
= Preparation =

Extract the source tarballs into your /usr/src directory.

Set up the TARGET and PREFIX environment variables:

 export TARGET=i586-pc-myos
 export PREFIX=/usr/local/cross

Create the build directories:

 mkdir build-binutils build-gcc build-newlib

== Introduction ==

You need the following before you get started:

* A build environment that can successfully build a [[GCC Cross-Compiler]].
* autoconf (exactly version 2.69)
* automake (exactly version 1.15.1)
* libtool
* The latest Binutils source code (2.39 at time of writing).
* The latest GCC source code (12.2.0 at time of writing).
* Knowledge of the internals of Binutils and GCC.
* Knowledge of autoconf and automake.
* The dependencies of Binutils and GCC as detailed in [[GCC Cross-Compiler]].
 
If you're compiling a later version of GCC/binutils, find versions of autoconf and automake that were released just prior to your version of GCC/binutils. See the [[Cross-Compiler Successful Builds]] page for known compatible versions of GCC and binutils.
 
Additionally you will need a [[C Library]] as described in a later section. As detailed in [[Hosted GCC Cross-Compiler]], it doesn't need to support much and the functionality can be stubbed, but libgcc will need to believe you have a libc.
 
You should decide exactly what targets you'll add to Binutils and GCC. If you have been using a generic <tt>i686-elf</tt> or <tt>x86_64-elf</tt> or such target, you'll simply want to swap <tt>-elf</tt> with <tt>-myos</tt> and get <tt>i686-myos</tt> and <tt>x86_64-myos</tt>. Naturally, don't actually write myos, but rather use the name of your OS converted to lower case. See [[Target Triplet]].
 
This tutorial currently only has instructions for adding new x86 and x86_64 targets for myos, but it serves as a good enough example that it should be trivial to add more processors by basing it on these instructions and what other operating systems have done.
 
 
== Making your changes reproducible ==
Since at some point you might have a contributor who wants to participate in your project, you likely want to make your toolchain setup reproducible. Instead of downloading the tarballs, it is therefore a good idea to clone the repositories of binutils (https://sourceware.org/git/binutils-gdb.git) and gcc (https://gcc.gnu.org/git/gcc.git) locally.
 
All release versions are tagged within these repositories, so you can use <tt>"git checkout binutils-2_39"</tt> and <tt>"git checkout releases/gcc-12.2.0"</tt> to switch to the correct versions respectively. After making the described changes, you can easily create a patch using <tt>"git diff"</tt> which you can reuse later.
Another option would be to fork one of the mirror repositories on GitHub.
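
The workflow described above could be scripted roughly as follows. This is only a sketch: the directory names and patch file names are assumptions made for illustration, and by default the script merely prints the commands (set <tt>DRY_RUN=0</tt> to actually run them).

<syntaxhighlight lang="bash">
#!/bin/sh
# Sketch only: prints the commands by default. Run with DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clone a repository unless the directory already exists, then switch to a release tag.
fetch_release() {
    url="$1"; dir="$2"; tag="$3"
    [ -d "$dir" ] || run git clone "$url" "$dir"
    run git -C "$dir" checkout "$tag"
}

# Save the local modifications of a checkout as a patch file.
save_patch() {
    dir="$1"; patch="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ git -C $dir diff > $patch"
    else
        git -C "$dir" diff > "$patch"
    fi
}

fetch_release https://sourceware.org/git/binutils-gdb.git binutils-gdb binutils-2_39
fetch_release https://gcc.gnu.org/git/gcc.git gcc releases/gcc-12.2.0

# ... edit the sources as described in the following sections, then:
save_patch binutils-gdb myos-binutils.patch   # hypothetical file name
save_patch gcc myos-gcc.patch                 # hypothetical file name
</syntaxhighlight>

Keeping the patches under version control next to your OS sources lets a contributor recreate the toolchain from pristine release tags.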
 
 
= Modify the source directories to add your target =
Now you can start adding your own target to each package. Some of the changes are quite similar between each package.

== Modifying Binutils ==

=== config.sub ===
 
This is a file you will modify in the same way for each package. It is a GNU standard file produced by including the line 'AC_CANONICAL_SYSTEM' in a configure.ac that is processed by autoconf, and is designed to convert a canonical name of the form <tt>i686-pc-myos</tt> into separate variables for the processor, vendor and OS, and also rejects systems it doesn't know about. We simply need to add 'myos' to the list of acceptable operating systems. Find the section that begins with the comment "<tt>Now accept the basic system types</tt>" (it begins '<tt>-gnu*</tt>') and add '<tt>-myos*</tt>' to the list. Find a line with some free room and add your entry there.
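
To make the idea of splitting a canonical name more concrete, here is a small stand-alone sketch. It is not taken from <tt>config.sub</tt>, which is far more thorough; the function names and the shortened list of accepted systems are assumptions made for illustration.

<syntaxhighlight lang="bash">
#!/bin/sh
# Illustration only: config.sub itself is far more thorough than this.
# Split a canonical name such as i686-pc-myos into CPU, vendor and OS.
split_triplet() {
    cpu="${1%%-*}"        # text before the first dash
    rest="${1#*-}"        # text after the first dash
    vendor="${rest%%-*}"
    os="${rest#*-}"
    echo "cpu=$cpu vendor=$vendor os=$os"
}

# Accept or reject the OS part, in the spirit of the list you edit in config.sub.
check_os() {
    case "$1" in
        gnu* | myos*) echo "accepted" ;;
        *)            echo "rejected" ;;
    esac
}

split_triplet i686-pc-myos
check_os myos
check_os notanos
</syntaxhighlight>

Once the real file is patched, running <tt>./config.sub i686-myos</tt> from the top of the source tree should print a canonical name instead of an error.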
 
=== bfd/config.bfd ===
 
This file is part of the configuration for libbfd, the back-end to Binutils which provides a consistent interface for many object file formats. Generally, each platform-specific version of Binutils contains a libbfd which only supports the object files normally in use on that system, as otherwise the library would be massive (libbfd can support a ''lot'' of object types). We need to associate our OS with some particular object types. There is a long list starting 'WHEN ADDING ENTRIES TO THIS MATRIX' with the first line as 'case "${targ}" in'. We need to add our full canonical name to this list, by adding some cases such as:
 
<syntaxhighlight lang="bash">
  i[3-7]86-*-myos*)
    targ_defvec=i386_elf32_vec
    targ_selvecs=
    targ64_selvecs=x86_64_elf64_vec
    ;;
#ifdef BFD64
  x86_64-*-myos*)
    targ_defvec=x86_64_elf64_vec
    targ_selvecs=i386_elf32_vec
    want64=true
    ;;
#endif
</syntaxhighlight>
 
'''Note:''' If using binutils-2.24 or older, change <i>i386_elf32_vec</i> to <i>bfd_elf32_i386_vec</i> and <i>x86_64_elf64_vec</i> to <i>bfd_elf64_x86_64_vec</i>.
 
Be sure to follow the instructions in the comment block above the list and add your entry beneath the comment "<tt>#START OF targmatch.h</tt>". If you like, you could support different object formats (look at other entries in the list, and the contents of 'bfd' for hints) and also provide more than one to the <tt>targ_selvecs</tt> line. For instance, you can support coff object files if you add <tt>i386coff_vec</tt> to the <tt>targ_selvecs</tt> list. All the <tt>x86_64</tt> entries in the file are wrapped in <tt>#ifdef BFD64</tt>: when targmatch.sed processes this file to turn it into targmatch.h, it will add the <tt>#ifdef BFD64</tt> guard so that the relevant code is only compiled when targeting a 64-bit platform.
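
The case patterns used in these entries are ordinary shell globs, so you can try them out in isolation. The following sketch only mimics the matching; the function name and the sample target names are assumptions made for illustration.

<syntaxhighlight lang="bash">
#!/bin/sh
# Illustration only: mimics the case patterns added to bfd/config.bfd.
# Reports which of the two new entries a given target name would select.
match_target() {
    case "$1" in
        i[3-7]86-*-myos*) echo "32-bit entry" ;;
        x86_64-*-myos*)   echo "64-bit entry" ;;
        *)                echo "no myos entry" ;;
    esac
}

for targ in i686-pc-myos i386-unknown-myos1.0 x86_64-pc-myos i686-pc-elf i686-myos; do
    printf '%s -> %s\n' "$targ" "$(match_target "$targ")"
done
</syntaxhighlight>

Note that the short form <tt>i686-myos</tt> does not match by itself: the patterns expect the full three-part name that <tt>config.sub</tt> produces.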
 
=== gas/configure.tgt ===
 
This file tells the gnu assembler what type of output to generate for each target. It automatically matches the i686 part of your target and generates the correct output for that. We just need to tell it what type of object file to generate for myos. In the section starting '<tt>Assign object format ... case ${generic_target} in</tt>' you need to add a line like
 
<syntaxhighlight lang="bash">
i386-*-myos*) fmt=elf ;;
</syntaxhighlight>
 
You should use '<tt>i386</tt>' in this line even if you are targeting x86_64. This is the only file where you should do it. It is basically because the variable 'generic_target' is not your canonical target name, but rather a variable generated further up in the configure.tgt file, and it sets the first part to <tt>i386</tt> for any <tt>i[3-7]86</tt> or <tt>x86_64</tt>.
 
Note: this will use the 'generic' emulation. One side-effect is that gas will interpret slash ('/') as a comment, not as a division operator. This will break any code like "<tt>movl $(ADDRESS/PAGE_SIZE), %eax</tt>". Using "<tt>fmt=elf em=gnu ;;</tt>" or "<tt>fmt=elf em=linux ;;</tt>" will disable slash as a comment character.
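
The reason a single <tt>i386-*-myos*</tt> entry covers both 32-bit and 64-bit targets can be sketched as follows. This is a simplified model of what <tt>gas/configure.tgt</tt> does, not a copy of it; the function names are assumptions, and only the x86 cases are modelled.

<syntaxhighlight lang="bash">
#!/bin/sh
# Simplified sketch of how gas/configure.tgt derives generic_target.
# The real script handles many more CPUs; only the x86 cases are shown here.
generic_target() {
    cpu="${1%%-*}"
    rest="${1#*-}"
    case "$cpu" in
        i[3-7]86 | x86_64) cpu_type=i386 ;;
        *)                 cpu_type="$cpu" ;;
    esac
    echo "${cpu_type}-${rest}"
}

# Object format lookup keyed on generic_target, as in the entry added above.
object_format() {
    case "$(generic_target "$1")" in
        i386-*-myos*) echo "elf" ;;
        *)            echo "unknown" ;;
    esac
}

object_format i686-pc-myos
object_format x86_64-pc-myos
object_format arm-none-eabi
</syntaxhighlight>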
 
=== ld/configure.tgt ===
This file tells the gnu linker what 'emulation' to use for each target. An emulation is basically a combination of linker script and executable file format. We are going to define our own emulation called <tt>elf_i386_myos</tt>. We need to add an entry to the case statement here after '<tt>Please try to keep this table more or less in alphabetic order ... case "${targ}" in</tt>':

<syntaxhighlight lang="bash">
i[3-7]86-*-myos*)
			targ_emul=elf_i386_myos
			targ_extra_emuls=elf_i386
			targ64_extra_emuls="elf_x86_64_myos elf_x86_64"
			;;
x86_64-*-myos*)
			targ_emul=elf_x86_64_myos
			targ_extra_emuls="elf_i386_myos elf_x86_64 elf_i386"
			;;
</syntaxhighlight>

* '''elf_i386_myos''' is a 32-bit target for your OS.
* '''elf_x86_64_myos''' is a 64-bit target for your OS.
* '''elf_i386''' is a bare 32-bit target as you had with <tt>i686-elf</tt>.
* '''elf_x86_64''' is a bare 64-bit target as you had with <tt>x86_64-elf</tt>.

This setup provides you with a 32-bit toolchain that can also produce 64-bit executables, and a 64-bit toolchain that can also produce 32-bit executables. This comes in handy if you use objcopy, for instance. You can also add <tt>targ_extra_emuls</tt> entries to specify other targets ld should support. See the <tt>ld/configure.tgt</tt> file for examples.

=== ld/emulparams/elf_i386_myos.sh ===

Now we need to actually define our emulation. There is a generic file called <tt>ld/genscripts.sh</tt> which creates the required linker scripts for our target (you need more than one, depending on shared object usage and the like: I have 13 for a single target). It uses a linker script template (from the ld/scripttempl directory) to do this, and it creates the actual emulation C file from an emulation template (from the ld/emultempl directory). These templates are customised by running a script in the ld/emulparams directory which sets various variables. You are welcome to define your own emulation and linker templates, but I find the ELF ones adequate, given that they can be customised by simply adding a file to the emulparams directory. This is what we are going to do now. The content of the file could be something like:

<syntaxhighlight lang="bash">
source_sh ${srcdir}/emulparams/elf_i386.sh
TEXT_START_ADDR=0x08000000
</syntaxhighlight>

This script is included by <tt>ld/genscripts.sh</tt> to customize its behavior through shell variables. We include the base <tt>elf_i386.sh</tt> script as it sets reasonable defaults. Finally, we override the variables whose defaults we disagree with.

An earlier form of this file, <tt>ld/emulparams/myos_i386.sh</tt>, did not source <tt>elf_i386.sh</tt> and instead set each variable explicitly:

<syntaxhighlight lang="bash">
SCRIPT_NAME=elf              # the scripttempl file to use, in this case the ELF one
OUTPUT_FORMAT=elf32-i386     # create our executables in elf32 format
TEXT_START_ADDR=0x40000000   # start of the .text section and therefore of the entire executable image;
                             # chosen because the author's kernel occupies 0x0-0x3fffffff
# Tell ld the page sizes so it can properly align sections to page boundaries.
# Just use the defaults here.
MAXPAGESIZE="CONSTANT (MAXPAGESIZE)"
COMMONPAGESIZE="CONSTANT (COMMONPAGESIZE)"
ARCH=i386
MACHINE=
NOP=0x90909090               # what ld pads sections with: the i386 NOP instruction 4 times to fill a 32-bit value
TEMPLATE_NAME=elf32          # the emultempl file to use, again ELF
# Whether to generate linker scripts to produce shared libraries and
# position-independent executables respectively.
GENERATE_SHLIB_SCRIPT=yes
GENERATE_PIE_SCRIPT=yes
# The purpose of these two is unclear; they were copied from the ELF emulation and seem to work.
NO_SMALL_DATA=yes
SEPARATE_GOTPLT=12
</syntaxhighlight>
 
There are a large number of variables that can be set here to customize your toolchain. Read the documentation and look at existing emulations for further information. These are some of the variables that can be set:
 
* '''GENERATE_SHLIB_SCRIPT'''=''yes|no'' Whether to generate a linker script for shared libraries. We enable this as you might want it later.
* '''GENERATE_PIE_SCRIPT'''=''yes|no'' Whether to generate a linker script for position independent executables. We enable this as you might want it later.
* '''SCRIPT_NAME'''=''name'' Controls which ld/scripttempl/''name''.sc script generates our linker scripts.
* '''TEMPLATE_NAME'''=''name'' Controls which ld/emultempl/''name''.em script generates our bfd emulation C implementation.
* '''OUTPUT_FORMAT'''=''name'' The name of the BFD output target we use.
* '''TEXT_START_ADDR'''=0x''value'' Controls where the executable begins in memory.
 
You can read the base <tt>elf_i386.sh</tt> script for the defaults of these variables and then decide for yourself whether you wish to override them for your operating system.
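
As a sketch of what overriding more than one default might look like, the fragment below sets a few of the variables listed above after sourcing the base script. The chosen values are assumptions for illustration only; in particular the start address assumes a kernel that occupies the lower 1 GiB of the address space.

<syntaxhighlight lang="bash">
# ld/emulparams/elf_i386_myos.sh (illustrative values, adjust for your OS)
source_sh ${srcdir}/emulparams/elf_i386.sh
GENERATE_SHLIB_SCRIPT=yes
GENERATE_PIE_SCRIPT=yes
TEXT_START_ADDR=0x40000000   # example: user programs start above a kernel at 0x0-0x3fffffff
</syntaxhighlight>

Anything you do not set keeps the value inherited from <tt>elf_i386.sh</tt>.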
 
=== ld/emulparams/elf_x86_64_myos.sh ===
 
This file is just like the above <tt>ld/emulparams/elf_i386_myos.sh</tt> but for x86_64.
 
<syntaxhighlight lang="bash">
source_sh ${srcdir}/emulparams/elf_x86_64.sh
</syntaxhighlight>
 
=== ld/Makefile.am ===
 
We now just need to tell make how to produce the emulation C file for our specific emulation. Putting the '<tt>targ_emul=elf_i386_myos</tt>' line into <tt>ld/configure.tgt</tt> above implies that your host linker will try to link your target ld executable with an object file called <tt>eelf_i386_myos.o</tt>. There is a default rule to generate this from <tt>eelf_i386_myos.c</tt>, so we just need to tell it how to make this <tt>eelf_i386_myos.c</tt> file. As stated above, we let the genscripts.sh file do the hard work. You need to add <tt>eelf_i386_myos.c</tt> to the <tt>ALL_EMULATION_SOURCES</tt> list; you also need to add <tt>eelf_x86_64_myos.c</tt> to the <tt>ALL_64_EMULATION_SOURCES</tt> list if applicable.
 
'''Note''': You ''must'' run <tt>automake</tt> in the <tt>ld</tt> directory after you modify <tt>Makefile.am</tt> to regenerate <tt>Makefile.in</tt>.
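
The regeneration step could be wrapped in a small helper like the one below. It is a sketch under assumptions: <tt>BINUTILS_SRC</tt> is a hypothetical variable pointing at your Binutils source tree, and the version check simply compares against the automake version named in the prerequisites.

<syntaxhighlight lang="bash">
#!/bin/sh
# Sketch: regenerate ld/Makefile.in after editing ld/Makefile.am.
# BINUTILS_SRC is a hypothetical variable pointing at your Binutils source tree.
BINUTILS_SRC="${BINUTILS_SRC:-$PWD/binutils-gdb}"
WANTED_AUTOMAKE="1.15.1"   # version named in the prerequisites

# Extract the version number from the first line of `automake --version` output.
automake_version() {
    head -n 1 | sed 's/.* //'
}

regenerate_ld() {
    if [ ! -f "$BINUTILS_SRC/ld/Makefile.am" ]; then
        echo "ld/Makefile.am not found under $BINUTILS_SRC" >&2
        return 1
    fi
    found="$(automake --version | automake_version)"
    if [ "$found" != "$WANTED_AUTOMAKE" ]; then
        echo "warning: automake $found found, $WANTED_AUTOMAKE expected" >&2
    fi
    (cd "$BINUTILS_SRC/ld" && automake)
}

regenerate_ld || echo "nothing regenerated"
</syntaxhighlight>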
=== ld/Makefile.in ===
We now just need to tell make how to produce the emulation C file for our specific emulation. Putting the 'targ_emul=myos_i386' line into ld/configure.tgt above implies that your host linker will try to link your target ld executable with an object file called emyos_i386.o. There is a default rule to generate this from emyos_i386.c, so we just need to tell it how to make this emyos_i386.c file. As stated above, we let the genscripts.sh file do the hard work. You need to add a makefile rule (I add it after the one to build eelf_i386.c) similar to:
 emyos_i386.c: $(srcdir)/emulparams/myos_i386.sh $(ELF_DEPS) $(srcdir)/scripttempl/elf.sc ${GEN_DEPENDS}
 	${GENSCRIPTS} myos_i386 "$(tdir_myos_i386)"
Note that some parts of the line use normal brackets () whereas other parts use curly braces {}.

'''Note:''' The second line must start with a single ''tab'', not spaces, because this is a [[Makefile]].

You can also add 'emyos_i386.o' to the dependencies of 'ALL_EMULATIONS' if you like, but it is not essential.

== Modifying GCC ==
=== config.sub ===
 
Similar modification to config.sub in Binutils.
 
=== gcc/config.gcc ===
 
This file defines what needs to be built for each particular target and what to include in the final executable. There are two main sections: one which defines generic options for your operating system, and those which define options specific to your operating system on each individual machine type.

For the first part, find the '<tt>case ${target} in</tt>' line just after '<tt># Common parts for widely ported systems</tt>' (around line 688) and add something like:

<syntaxhighlight lang="bash">
*-*-myos*)
  gas=yes
  gnu_ld=yes
  default_use_cxa_atexit=yes
  use_gcc_stdint=provide
  ;;
</syntaxhighlight>

* '''gas=yes''' Our operating system by default uses the GNU assembler.
* '''gnu_ld=yes''' Our operating system by default uses the GNU linker.
* '''default_use_cxa_atexit=yes''' We will provide ''__cxa_atexit'' (you will need to provide this in your standard library).
* '''use_gcc_stdint=provide''' This instructs GCC to provide you with a ''stdint.h'' appropriate for your target. Change <tt>provide</tt> to <tt>wrap</tt> if you have your own ''stdint.h'', to make GCC wrap yours.

The second section we need to add to is the architecture-specific one. Find the '<tt>case ${target} in</tt>' line just before '<tt>tm_file="${tm_file} elfos.h newlib-stdint.h"</tt>' (around line 1094) and add something like:

<syntaxhighlight lang="bash">
i[34567]86-*-myos*)
    tm_file="${tm_file} i386/unix.h i386/att.h elfos.h glibc-stdint.h i386/i386elf.h myos.h"
    ;;
x86_64-*-myos*)
    tm_file="${tm_file} i386/unix.h i386/att.h elfos.h glibc-stdint.h i386/i386elf.h i386/x86-64.h myos.h"
    ;;
</syntaxhighlight>

This defines which target configuration header files get used. You can make <tt>i386/myos32.h</tt> and <tt>i386/myos64.h</tt> files if desired.

Earlier revisions of these instructions, written for the older GCC releases listed in the prerequisites, used the following entries instead. For the first part, find the section starting 'Common parts for widely ported systems ... case ${target} in' and add something like:

 *-*-myos*)
   extra_parts="crtbegin.o crtend.o"
   gas=yes
   gnu_ld=yes
   default_use_cxa_atexit=yes
   ;;

This basically says: build the static versions of crtbegin and crtend (important for various parts of [[libgcc]]), our operating system by default uses the GNU linker and assembler and that we will provide __cxa_atexit (newlib will actually provide it).

The second section we need to add to is the architecture-specific one. Find the section starting 'case ${target} in ... Support site-specific machine types' and add:

 i[3-7]86-*-myos*)
   tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h i386/i386elf.h myos.h"
   tmake_file="i386/t-i386elf t-svr4"
   use_fixproto=yes
   ;;

This defines some extra include files and makefile fragments to use. In GCC 4.7.0, the text 'Support site-specific machine types' does not exist in this file, but you still need to edit the appropriate table: it is the one immediately following the table to which you added the first part ('Common parts for widely ported systems'). We will create the myos.h in the next step.
 
=== gcc/config/myos.h ===
Now we create a header file which sets some OS-specific stuff in GCC. This currently just sets some variables which state that the C preprocessor should #define some macros, make some asserts and set the default system name when we are compiling an application for our OS. Something like the following should be fine.
<pre>
#undef TARGET_OS_CPP_BUILTINS
#define TARGET_OS_CPP_BUILTINS()      \
  do {                                \
    builtin_define_std ("myos");      \
    builtin_define_std ("unix");      \
    builtin_assert ("system=myos");   \
    builtin_assert ("system=unix");   \
  } while(0);

#undef TARGET_VERSION // note that adding these two lines cause an error in gcc-4.7.0
#define TARGET_VERSION fprintf(stderr, " (i386 myos)"); // the build process works fine without them until someone can work out an alternative
</pre>

This header allows you to customize your toolchain using preprocessor macros. The relevant parts of GCC will include this header (as controlled by <tt>gcc/config.gcc</tt>) and modify the behavior according to your customizations.

You can explore <tt>gcc/defaults.h</tt> for a full list of things you can modify, and more importantly, the assumptions GCC will make about various aspects of your target. For instance, if <tt>PID_TYPE</tt> is not defined in <tt>myos.h</tt>, then GCC will default it to <tt>int</tt>, which can be problematic if that's not what your <tt>pid_t</tt> is defined as.
 
<syntaxhighlight lang="C">
/* Useful if you wish to make target-specific GCC changes. */
#undef TARGET_MYOS
#define TARGET_MYOS 1

/* Default arguments you want when running your
   i686-myos-gcc/x86_64-myos-gcc toolchain */
#undef LIB_SPEC
#define LIB_SPEC "-lc" /* link against C standard library */

/* Files that are linked before user code.
   The %s tells GCC to look for these files in the library directory. */
#undef STARTFILE_SPEC
#define STARTFILE_SPEC "crt0.o%s crti.o%s crtbegin.o%s"

/* Files that are linked after user code. */
#undef ENDFILE_SPEC
#define ENDFILE_SPEC "crtend.o%s crtn.o%s"

/* Additional predefined macros. */
#undef TARGET_OS_CPP_BUILTINS
#define TARGET_OS_CPP_BUILTINS()      \
  do {                                \
    builtin_define ("__myos__");      \
    builtin_define ("__unix__");      \
    builtin_assert ("system=myos");   \
    builtin_assert ("system=unix");   \
    builtin_assert ("system=posix");  \
  } while (0)
</syntaxhighlight>
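
Once the toolchain has been built, you may want to confirm that the macros from <tt>TARGET_OS_CPP_BUILTINS</tt> are really predefined. The following sketch asks the compiler to dump its predefined macros; the helper name is an assumption, and the compiler defaults to the <tt>i686-myos-gcc</tt> name used in this tutorial.

<syntaxhighlight lang="bash">
#!/bin/sh
# Sketch: check whether a compiler predefines a given macro.
# -dM -E dumps the predefined macros of an empty translation unit.
has_macro() {   # has_macro <compiler> <macro>
    "$1" -dM -E - < /dev/null 2>/dev/null | grep -q "^#define $2 "
}

CC="${CC:-i686-myos-gcc}"   # cross-compiler name used in this tutorial
if ! command -v "$CC" > /dev/null 2>&1; then
    echo "$CC is not installed yet"
elif has_macro "$CC" __myos__; then
    echo "$CC defines __myos__"
else
    echo "$CC does not define __myos__"
fi
</syntaxhighlight>

Running the same check against your host compiler is a quick way to see the difference between the two toolchains.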
 
=== libstdc++-v3/crossconfig.m4 ===
 
This file describes how the libstdc++ configure file will examine your operating system and adjust the provided features of libstdc++ accordingly. It is only necessary if you are compiling the GNU C++ compiler and wish to cross-compile the standard C++ library for your OS. You can experiment with adding more options (as in the Linux case) if you wish. Add a case similar to:
 
<syntaxhighlight lang="bash">
*-myos*)
GLIBCXX_CHECK_COMPILER_FEATURES
AC_CHECK_HEADERS([sys/types.h locale.h float.h])
GLIBCXX_CHECK_LINKER_FEATURES
GLIBCXX_CHECK_BUILTIN_MATH_SUPPORT
GLIBCXX_CHECK_MATH_SUPPORT
GLIBCXX_CHECK_COMPLEX_MATH_SUPPORT
GLIBCXX_CHECK_STDLIB_SUPPORT
;;
</syntaxhighlight>
 
'''Note''': You need to run '<tt>autoconf</tt>' in the <tt>libstdc++-v3</tt> directory if you change this file.
 
=== libgcc/config.host ===
This file defines special configuration for [[libgcc]] for different hosts. Find the '<tt>case ${host} in</tt>' just prior to '<tt>extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"</tt>' (around line 368) and add the cases:

<syntaxhighlight lang="bash">
i[34567]86-*-myos*)
	extra_parts="$extra_parts crti.o crtbegin.o crtend.o crtn.o"
	tmake_file="$tmake_file i386/t-crtstuff t-crtstuff-pic t-libgcc-pic"
	;;
x86_64-*-myos*)
	extra_parts="$extra_parts crti.o crtbegin.o crtend.o crtn.o"
	tmake_file="$tmake_file i386/t-crtstuff t-crtstuff-pic t-libgcc-pic"
	;;
</syntaxhighlight>

==== Older GCC versions ====
Find the section starting "# Support site-specific machine types.". Add something like this:
 i[3-7]86-*-myos*)
 	;;
Note that before the ";;" is a ''tab''. You need this change or else when [[libgcc]] is configured you'll get an error about the configuration being unsupported.

==== Newer GCC versions (4.7.0 and newer?) ====
For GCC 4.7.0 and newer, the header 'Support site-specific machine types' is removed but you still need to add a definition for your OS, otherwise the build process will cause an error.
The section that needs editing is near the end of the file, around line 1140. There is a default entry which causes an error: you need to add an entry for your OS to this case statement. You can add it just above the default "*)" entry.
For these GCC versions, it appears the crtbegin/crtend generation has been moved to [[libgcc]]. We therefore need to add lines like these, rather than just the empty statement for previous versions:
 i[3-7]86-*-myos*)
 	extra_parts="crtbegin.o crtend.o"
 	tmake_file="$tmake_file i386/t-crtstuff"
 	;;
Without extra_parts, crtbegin.o and crtend.o are not built; without the tmake_file addition, GCC can give the error "error in /.../crtend.o(.eh_frame); no .eh_frame_hdr table will be created.", as discussed in [http://forum.osdev.org/viewtopic.php?t=25489 this forum thread].
 
=== fixincludes/mkfixinc.sh ===
== Selecting a C Library ==
At this point, you have to decide which [[C Library]] to use. You can either pick an existing [[C Library]] (such as newlib) or [[Creating_a_C_Library|create your own libc]].
 
You should disable fixincludes for your operating system. Find the case statement and add a pattern for your operating system. For instance:
== Newlib ==
=== config.sub ===
Same as for binutils.

<syntaxhighlight lang="bash">
# Check for special fix rules for particular targets
case $machine in
    *-myos* | \
    *-*-myos* | \
    i?86-*-cygwin* | \
    # (... snip ...)
    powerpcle-*-eabi* )
        # IF there is no include fixing,
        # THEN create a no-op fixer and exit
        (echo "#! /bin/sh" ; echo "exit 0" ) > ${target}
        ;;
</syntaxhighlight>

A number of operating systems (especially older and obscure ones) provide troublesome system headers that fail to strictly comply with various standards. The GCC developers consider it their job to fix these headers. GCC will look into your system root, apply a bunch of patterns to detect headers it doesn't like, then copy each such header into a private GCC system directory (which overrides your standard system directory) and attempt to fix it. Sometimes fixincludes even breaks working headers (some people refer to it as breakincludes).

This is rather inconvenient, as your libc will likely happen to trigger these patterns (and false positives often happen). Any time you change your system headers, you have to rebuild your compiler so the fixed versions get updated. The first time you encounter this, it will show up as a system header that does nothing different even though you edit it.

This addition to the <tt>mkfixinc.sh</tt> file forcefully disables fixincludes for your operating system. It's your job to provide working system headers, not the compiler developers'.

=== newlib/configure.host ===
Tell newlib which system-specific directory to use for our particular target. In the section starting 'Get the source directories to use for the host ... case "${host}" in', add a section
 i[3-7]86-*-myos*)
     sys_dir=myos
     ;;

=== newlib/libc/sys/configure.in ===
Tell the newlib build system that it also needs to configure our myos-specific host directory. In the 'case ${sys_dir} in' list, simply add
 myos) AC_CONFIG_SUBDIRS(myos) ;;

'''After this, you need to run 'autoconf' in the newlib/libc/sys directory.'''

=== newlib/libc/sys/myos ===
This is a directory that we need to create, where we put our OS-specific extensions to newlib. We need to create a minimum of 4 files. You can easily add more files to this directory to define your own OS-specific library functions, if you want them to be included in libc.a (and so linked into every application by default).
 
== Further Customization ==

'''TODO''': Document more tips and tricks for further customization of OS-specific toolchains.

==== crt0.S ====
This file creates crt0.o, which is linked into every application. It should define the symbol _start, and then call the main() function, possibly after setting up process-space segment selectors and pushing argc and argv onto the stack. A simple implementation is:
 .global _start
 .extern main
 .extern exit
 _start:
     call main
     pushl %eax    /* pass main's return value to exit() as its argument */
     call exit
 .wait:
     hlt
     jmp .wait

It is also worth mentioning that crt0 can be written in C instead of assembly. There are a couple of reasons why you may want to do so, including that you would be able to properly find the entry point to programs written in C++, without having to worry about name mangling or using C linkage. It is also (marginally) easier to handle the argc and argv parameters in C.
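
As a rough illustration of the C approach, the sketch below shows what the C half of a crt0 might look like. It is not newlib's or any particular OS's implementation: the names <tt>main_fn</tt>, <tt>crt0_call_main</tt> and <tt>demo_main</tt> are made up for this example, and the assumption that the environment pointers sit directly after argv's terminating NULL (a System V-style layout) only holds if your kernel builds the process stack that way. A real <tt>_start</tt> would pass the returned value on to exit().

```c
#include <stddef.h>

/* Signature of the program entry point that crt0 eventually calls. */
typedef int (*main_fn)(int argc, char **argv, char **envp);

/*
 * Hypothetical C half of crt0: locate envp and call the program's main.
 * ASSUMPTION: argv[argc] is NULL and the environment pointers follow it
 * immediately. Adjust this to whatever your kernel really sets up.
 */
int crt0_call_main(main_fn entry, int argc, char **argv)
{
    char **envp = argv + argc + 1;
    return entry(argc, argv, envp);
}

/* Stand-in for a program's main(), only here to make the sketch testable. */
int demo_main(int argc, char **argv, char **envp)
{
    (void)argv;
    /* Report the argument count, plus 100 if an environment entry exists. */
    return argc + (envp[0] != NULL ? 100 : 0);
}
```

In a real crt0 the <tt>entry</tt> parameter would simply be the program's <tt>main</tt>, and any other setup (such as initialising <tt>environ</tt>) would happen before the call.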
 
=== Changing the Default Include Directory ===

If you wish to change the default include directory from <tt>/usr/include</tt>, you can override the <tt>native_system_header_dir</tt> variable in <tt>gcc/config.gcc</tt> in the case for your OS.

==== syscalls.c ====
This file should contain implementations for each of the system calls that newlib depends on. There is a list on the newlib website, but I believe it to be slightly out of date, as my version had some extra ones not documented there. Generally, each of these system calls should trigger an interrupt or use sysenter/syscall to run a kernel-space system call. As such, they are heavily OS-specific. A non-exhaustive list is:
<pre>
/* note these headers are all provided by newlib - you don't need to provide them */
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/times.h>
#include <sys/errno.h>
#include <sys/time.h>
#include <stdio.h>

void _exit();
int close(int file);
char **environ; /* pointer to array of char * strings that define the current environment variables */
int execve(char *name, char **argv, char **env);
int fork();
int fstat(int file, struct stat *st);
int getpid();
int isatty(int file);
int kill(int pid, int sig);
int link(char *old, char *new);
int lseek(int file, int ptr, int dir);
int open(const char *name, int flags, ...);
int read(int file, char *ptr, int len);
caddr_t sbrk(int incr);
int stat(const char *file, struct stat *st);
clock_t times(struct tms *buf);
int unlink(char *name);
int wait(int *status);
int write(int file, char *ptr, int len);
int gettimeofday(struct timeval *p, struct timezone *z);
</pre>
 
=== Changing the Default Library Directory ===

If you wish to change the default library directory from <tt>/usr/lib</tt>, you can change it to <tt>/lib</tt> by adding the following block of code to the case just below the declaration of <tt>NATIVE_LIB_DIRS</tt> in <tt>binutils/ld/configure.tgt</tt> (around line 1056).

<syntaxhighlight lang="bash">
*-*-myos*)
  NATIVE_LIB_DIRS='/lib /local/lib'
  ;;
</syntaxhighlight>

=== Start Files Directory ===

You can change the directory in which GCC looks for crt0.o, crti.o and crtn.o. The path to that directory is stored in <tt>STANDARD_STARTFILE_PREFIX</tt>. For instance, if you change the library directory to <tt>/lib</tt> in Binutils and want GCC to match, you can add the following to <tt>gcc/config/myos.h</tt>:

<syntaxhighlight lang="C">
#undef STANDARD_STARTFILE_PREFIX
#define STANDARD_STARTFILE_PREFIX "/lib/"
</syntaxhighlight>

Note that the trailing slash is important, as the raw crt*.o names are appended without first adding a slash.

==== configure.in ====
Configure script for our system directory.
 AC_PREREQ(2.59)
 AC_INIT([newlib], [NEWLIB_VERSION])
 AC_CONFIG_SRCDIR([crt0.S])
 AC_CONFIG_AUX_DIR(../../../..)
 NEWLIB_CONFIGURE(../../..)
 AC_CONFIG_FILES([Makefile])
 AC_OUTPUT

==== Makefile.am ====
A Makefile template for this directory
<pre>
AUTOMAKE_OPTIONS = cygnus
INCLUDES = $(NEWLIB_CFLAGS) $(CROSS_CFLAGS) $(TARGET_CFLAGS)
AM_CCASFLAGS = $(INCLUDES)

noinst_LIBRARIES = lib.a

if MAY_SUPPLY_SYSCALLS
extra_objs = $(lpfx)syscalls.o
else
extra_objs =
endif

lib_a_SOURCES =
lib_a_LIBADD = $(extra_objs)
EXTRA_lib_a_SOURCES = syscalls.c crt0.S
lib_a_DEPENDENCIES = $(extra_objs)
lib_a_CCASFLAGS = $(AM_CCASFLAGS)
lib_a_CFLAGS = $(AM_CFLAGS)

if MAY_SUPPLY_SYSCALLS
all: crt0.o
endif

ACLOCAL_AMFLAGS = -I ../../..
CONFIG_STATUS_DEPENDENCIES = $(newlib_basedir)/configure.host
</pre>
 
'''After this, you need to run 'autoconf' in the newlib/libc/sys/ directory, and 'autoreconf' in the newlib/libc/sys/myos directory.'''
''Note: 'autoconf' and 'autoreconf' will only run with automake version <= 1.12 and autoconf version 2.64 (exactly) (applies to newlib source pulled from git repository July 31 2013)''
 
=== Signal handling ===
Newlib has two different mechanisms for dealing with UNIX signals (see the man pages for signal()/raise()). In the first, it provides its own emulation, where it maintains a table of signal handlers in a per-process manner. If you use this method, then you will only be able to respond to signals sent from within the current process. In order to support it, all you need to do is make sure your crt0 calls '_init_signal' before it calls main, which sets up the signal handler table.

Alternatively, you can provide your own implementation. To do this, you need to define your own version of signal() in syscalls.c. A typical implementation would register the handler somewhere in kernel space, so that issuing a signal from another process causes the corresponding function to be called in the receiving process (this will also require some nifty stack-playing in the receiving process, as you are basically interrupting the program flow in the middle). You then need to provide a kill() function in syscalls.c which actually sends signals to another process. Newlib will still define a raise() function for you, but it is just a stub which calls kill() with the current process id. To switch newlib to this mode, you need to #define the SIGNAL_PROVIDED macro when compiling. A simple way to do this is to add the line:
 newlib_cflags="${newlib_cflags} -DSIGNAL_PROVIDED"
to your host's entry in newlib/configure.host. It would probably also make sense to provide sigaction(), and provide signal() as a wrapper for it. Note that [http://www.opengroup.org/onlinepubs/007908799/xsh/sigaction.html OpenGroup.org's] definition of sigaction states that 1) sigaction supersedes signal, and 2) an application shouldn't use both to manipulate the same signal.

=== Dynamic linking ===
If you want to support dynamic linking with your toolchain, you need to:
* Pass the <tt>"--enable-shared"</tt> flag to the <tt>./configure</tt> scripts of both binutils and gcc
* Set <tt>LINK_SPEC</tt> in your "gcc/config/myos.h" so the gcc driver will pass the correct flags to the linker:

<syntaxhighlight lang="C">
#undef LINK_SPEC
#define LINK_SPEC "%{shared:-shared} %{static:-static} %{!shared: %{!static: %{rdynamic:-export-dynamic}}}"
</syntaxhighlight>

=== Linker Options ===

You can modify the arguments passed to the linker using <tt>LINK_SPEC</tt>. This can be used to force 4KB alignment of sections on 64-bit systems, as ld defaults to 2MB alignment.

<syntaxhighlight lang="C">
/* Tell ld to force 4KB pages */
#undef LINK_SPEC
#define LINK_SPEC "-z max-page-size=4096"
</syntaxhighlight>

== Selecting a C Library ==

{{Main|C Library}}

At this point, you have to decide which [[C Library]] to use. You have options:

* [[Creating a C_Library|Create your own C library]].
* Pick an existing [[C Library]] such as [[Porting Newlib|Newlib]].

= Building =
Configuring and building should be done in the build-xxx directory specific to the package you are building. Do not attempt to configure or build within the source directory. It is a method not supported by either myself or GNU. The configure option --disable-nls is optional, although I haven't tested without it.

Note that on multi-processor build machines, you may be able to speed up the build process by adding "-j X" (where X is the number of jobs to run in parallel) to the make commands below.

Also note that if your build environment requires you to make the install target as a different user (such as root), you'll need to add $PREFIX/bin to the $PATH environment variable for that user as well.

=== Note for Mac OS X users ===
The [[GCC_Cross-Compiler#MacOS_users.2C_beware|warning]] listed in the GCC Cross-Compiler article applies here as well. Make sure to read it and set up $CC/$CXX/$CPP/$LD to point to a "real" GCC, as opposed to LLVM-GCC, or your build will most likely fail!

== Building ==

{{Main|Hosted GCC Cross-Compiler}}
 
Your OS specific toolchain is built differently from the introductory <tt>i686-elf</tt> toolchain as it has a user-space and standard library. In particular, you need to ensure your libc meets the minimum requirements for libgcc. You need to install the standard library headers into your [[Meaty_Skeleton#System_Root|System Root]] before building the cross-compiler. You need to tell the cross-binutils and cross-gcc where the system root is via the configure option <tt>--with-sysroot=/path/to/sysroot</tt>. You can then build your libc with your cross-compiler and then finally libstdc++ if desired.

== Conclusion ==

You now have an <tt>i686-myos</tt> toolchain that can be used instead of your old <tt>i686-elf</tt> toolchain. Your new toolchain is effectively just a renamed <tt>i686-elf</tt> with customizations. You should switch all your operating system build scripts to use this new compiler, even the kernel and libk, as your new compiler is capable of providing a freestanding environment.

You will certainly wish to package up your custom toolchain (and be able to create a diff between the upstream version and your custom version for others to audit). Contributors should be able to download tarballs of your myos-binutils and myos-gcc packages, so they can build your custom toolchain themselves.

== Binutils ==
From /usr/src/build-binutils, run
 ../binutils-2.18/configure --target=$TARGET --prefix=$PREFIX --disable-werror
 make
 make install
 export PATH=$PATH:$PREFIX/bin

== GCC ==
From /usr/src/build-gcc, run (only use --enable-languages=c if you haven't downloaded and unpacked gcc-g++!)
 ../gcc-4.2.1/configure --target=$TARGET --prefix=$PREFIX --enable-languages=c,c++
 make all-gcc
 make install-gcc

You will also want libgcc. Run
 make all-target-libgcc
 make install-target-libgcc

== Newlib ==
From /usr/src/build-newlib, run
 ../newlib-1.15.0/configure --target=$TARGET --prefix=$PREFIX
 make
 make install
and optionally
 make install-info
== Libstdc++ ==
From /usr/src/build-gcc, run
 make
 make install
Note that libstdc++ can only be built after installing newlib, as it depends on libc.

== Common errors ==

=== Whitespaces ===

Some files need tabs, some files need spaces, and some files happily accept any mixture. Use an editor that can display special characters such as tabs and spaces, to be sure you use the right form. Whitespace errors may result in 'make' reporting missing separators. Some editors will replace a tab with four spaces, which will also cause invalid separator issues (Code::Blocks 8.02 does this - you need to switch it off in Settings > Editor).

=== Autoconf ===

There are several steps that conclude in running '<tt>autoconf</tt>', '<tt>automake</tt>' or '<tt>autoreconf</tt>'; be sure you did not miss them. The order of autoconf/autoreconf calls in a package is important. These errors may result in missing subdirectories of the build-* directory and/or 'make' reporting missing targets.
= Caveats =
== Using the cross toolchain to compile your kernel ==
Generally you don't want libc and crt0.o in your kernel. Link your kernel with the -nostdlib option to i586-pc-myos-gcc. Yes, gcc is also a front-end to ld, and is probably the preferred way to interface to it. You should specify a linker script to link your kernel, otherwise the kernel's .text section will be at the same location as a process' .text section, which is a bad thing. Use the -Wl,-Tlinkerscript.ld option.
 
== Linking a kernel with libsupc++ ==
You can use your libsupc++ to get exception handling and RTTI in a C++ kernel (no more passing -fno-exceptions -fno-rtti to g++!) so you can use things like throw and dynamic_cast<>. Libsupc++ depends upon [[libgcc]] for stack unwinding support. Passing the -nostdlib option to gcc when linking causes libgcc.a and libsupc++.a to not be included, so you need to specify -lgcc -lsupc++ on the command line (no need to specify the directories; gcc knows where it installed them to). In addition, you need to include a .eh_frame section in your linker script and terminate it with 32 bits of zeros (LONG(0) is a useful linker script command). The symbol start_eh_frame should point to the start of the eh_frame section, and it should be aligned by 4. You also need to include your constructors and destructors in the link (see [[C++]] for details), and to provide __register_frame() (or call the function provided by libgcc with the start of your .eh_frame section), void *__dso_handle;, __cxa_atexit() and __cxa_finalize (again see [[C++]]). Something along the lines of
#include <reent.h>
static struct _reent global_reent;
struct _reent *_impure_ptr = &global_reent;
somewhere in your kernel will keep libgcc happy, because it expects these bits to be provided by newlib (which you aren't linking into your kernel). Libgcc expects a number of (simple) C library functions to be provided (again normally by newlib) by your kernel, including abort, malloc, free, memcpy, memset and strlen. Libsupc++ also requires write, fputs, fputc, fwrite, strcpy and strcat for debugging output.
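
As an illustration of how small these support functions can be, here is a minimal sketch of three of them. The <tt>k_</tt> prefix is an assumption made so the example can be compiled and tested on a hosted system without clashing with the host libc; in the kernel they would carry the standard names memcpy, memset and strlen. These are straightforward byte-at-a-time versions, not tuned implementations.

```c
#include <stddef.h>

/* Copy n bytes from src to dst; the regions must not overlap. */
void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Fill n bytes at dst with the byte value c. */
void *k_memset(void *dst, int c, size_t n)
{
    unsigned char *d = dst;
    while (n--)
        *d++ = (unsigned char)c;
    return dst;
}

/* Count the characters before the terminating NUL. */
size_t k_strlen(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return len;
}
```

abort, malloc and free depend on your kernel's panic path and heap allocator, so they are left out of this sketch.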
 
== Disclaimer ==
This tutorial is provided for people who are already happy compiling a [[GCC Cross-Compiler]] and are generally sure of what they are doing. It is not, in any way, intended to replace the already excellent articles on [[GCC Cross-Compiler]] and [[Porting Newlib]] but is instead intended as a further resource for those who wish to take another step towards having their OS become self-hosting. The other tutorials mentioned are extensively tested and known to work, whereas the steps described here (given their increased complexity) are expected to be less well tested and could well include a number of bugs.

== See Also ==
=== Articles ===
* [[GCC]]

===External Links===
* Inspired by this article, [[Boomstick]] is a script to build a complete GCC toolchain, including newlib, for your OS. Just fill in the stubs! (or don't :)

[[Category:Tools]]
[[Category:Toolchains]]
[[Category:Tutorials]]