OS Specific Toolchain: Difference between revisions

{{Rating|3}}
 
This tutorial will guide you through creating a toolchain comprising Binutils and GCC that specifically targets your operating system. The instructions below teach Binutils and GCC how to create programs for a hypothetical OS named 'MyOS'.
 
Until now you have been using a [[GCC Cross-Compiler|cross-compiler]] configured to use an existing generic bare target. This is very convenient when starting out as you get a reliable target and the compiler doesn't make any bad assumptions because it thinks it is targeting an existing operating system. However, when you proceed it becomes useful if the compiler knows it is targeting your operating system and what its customs are. For instance, you can make the compiler define a <tt>__myos__</tt> preprocessor macro, know which directories to search for include files in, what special <tt>crt*.o</tt> files are used when linking against libc, and so on. It also becomes much easier to cross-compile software to your OS when you simply have to invoke <tt>x86_64-myos-gcc hello.c -o hello</tt> to cross-compile a program. Additionally part of the instructions here can be applied to other software packages that also use the GNU build system, which will help you [[Cross-Porting Software|port existing software]].
 
This tutorial teaches you how to set up a cross-compiler that specifically targets your OS. This is actually the first step of ''porting Binutils and GCC to your operating system'': any information you give GCC about your OS will help it run on your OS. Once your OS Specific Toolchain has been set up and you have built your OS with it, you can continue by using the cross-compiler to cross-compile the compiler itself to your OS, assuming your libc and kernel are powerful enough. For more information see [[Porting GCC to your OS]].
 
== Introduction ==
 
* A build environment that can successfully build a [[GCC Cross-Compiler]].
* autoconf (exactly version 2.69)
* automake (exactly version 1.15.1)
* libtool
* The latest Binutils source code (2.39 at time of writing).
* The latest GCC source code (12.2.0 at time of writing).
* Knowledge of the internals of Binutils and GCC.
* Knowledge of autoconf and automake.
* The dependencies of Binutils and GCC as detailed in [[GCC Cross-Compiler]].
 
If you're compiling a later version of GCC/binutils, find versions of autoconf and automake that were released just prior to your version of GCC/binutils. See the [[Cross-Compiler Successful Builds]] page for known compatible versions of GCC and binutils.
 
Additionally you will need a [[C Library]] as described in a later section. As detailed in [[Hosted GCC Cross-Compiler]], it doesn't need to support much and the functionality can be stubbed, but libgcc will need to believe you have a libc.
 
You should decide exactly what targets you'll add to Binutils and GCC. If you have been using a generic <tt>i686-elf</tt> or <tt>x86_64-elf</tt> or such target, you'll simply want to swap <tt>-elf</tt> with <tt>-myos</tt> and get <tt>i686-myos</tt> and <tt>x86_64-myos</tt>. Naturally, don't actually write myos, but rather use the name of your OS converted to lower case. See [[Target Triplet]].
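
The renaming described above can be sketched in shell. The helper name <tt>to_myos_triplet</tt> is hypothetical and only illustrates the suffix swap; substitute the lower-case name of your own OS for <tt>myos</tt>.

<syntaxhighlight lang="bash">
# Hypothetical helper: swap the generic "-elf" suffix for the OS name.
to_myos_triplet() {
    # ${1%-elf} strips a trailing "-elf" from the first argument.
    printf '%s\n' "${1%-elf}-myos"
}

to_myos_triplet i686-elf
to_myos_triplet x86_64-elf
</syntaxhighlight>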
 
This tutorial currently only has instructions for adding new x86 and x86_64 targets for myos, but it serves as a good enough example that it should be trivial to add more processors by basing them on these instructions and on what other operating systems have done.
 
 
== Making your changes reproducible ==
Since at some point you might have a contributor who wants to participate in your project, you likely want to make your toolchain setup reproducible. Instead of downloading the tarballs, it is therefore a good idea to clone the repositories of binutils (https://sourceware.org/git/binutils-gdb.git) and gcc (https://gcc.gnu.org/git/gcc.git) locally.
 
All release versions are tagged within these repositories, so you can use <tt>"git checkout binutils-2_39"</tt> and <tt>"git checkout releases/gcc-12.2.0"</tt> to switch to the correct versions respectively. After making the described changes, you can easily create a patch using <tt>"git diff"</tt> which you can reuse later.
Another option would be to fork one of the mirror repositories on GitHub.
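
As a rough sketch of that workflow for Binutils (the patch file name <tt>myos-binutils.patch</tt> is an assumption, and the GCC repository works the same way with its own tag):

<syntaxhighlight lang="bash">
# Fetch the sources and switch to the release tag named above.
git clone https://sourceware.org/git/binutils-gdb.git
git -C binutils-gdb checkout binutils-2_39

# ... make the changes described in this tutorial ...

# Stage everything first so that newly created files end up in the patch too.
git -C binutils-gdb add -A
git -C binutils-gdb diff --cached > myos-binutils.patch

# Later, on a fresh checkout of the same tag
# (the path is relative to the binutils-gdb directory):
git -C binutils-gdb apply ../myos-binutils.patch
</syntaxhighlight>

Staging before diffing matters because a plain <tt>git diff</tt> does not include untracked files, and this tutorial creates several new ones.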
 
 
== Modifying Binutils ==
=== config.sub ===
 
This is a file you will modify in the same way for each package. It is a GNU standard file produced by including the line 'AC_CANONICAL_SYSTEM' in a configure.ac that is processed by autoconf, and is designed to convert a canonical name of the form <tt>i686-pc-myos</tt> into separate variables for the processor, vendor and OS, and also rejects systems it doesn't know about. We simply need to add 'myos' to the list of acceptable operating systems. Find the section that begins with the comment "<tt>Now accept the basic system types</tt>" (it begins '<tt>-gnu*</tt>') and add '<tt>-myos*</tt>' to the list. Find a line with some free room and add your entry there.
 
=== bfd/config.bfd ===
 
This file is part of the configuration for libbfd, the back-end to Binutils which provides a consistent interface for many object file formats. Generally, each platform-specific version of Binutils contains a libbfd which only supports the object files normally in use on that system, as otherwise the library would be massive (libbfd can support a _lot_ of object types). We need to associate our OS with some particular object types. There is a long list starting 'WHEN ADDING ENTRIES TO THIS MATRIX' with the first line as 'case "${targ}" in'. We need to add our full canonical name to this list, by adding some cases such as:
 
<syntaxhighlight lang="bash">
  i[3-7]86-*-myos*)
    targ_defvec=i386_elf32_vec
    targ_selvecs=
    targ64_selvecs=x86_64_elf64_vec
    ;;
#ifdef BFD64
  x86_64-*-myos*)
    targ_defvec=x86_64_elf64_vec
    targ_selvecs=i386_elf32_vec
    want64=true
    ;;
#endif
</syntaxhighlight>
 
'''Note:''' If using binutils-2.24 or older, change <i>i386_elf32_vec</i> to <i>bfd_elf32_i386_vec</i> and <i>x86_64_elf64_vec</i> to <i>bfd_elf64_x86_64_vec</i>.
 
Be sure to follow the instructions in the comment block above the list and add your entry beneath the comment "<tt>#START OF targmatch.h</tt>". If you like, you could support different object formats (look at other entries in the list, and the contents of 'bfd' for hints) and also provide more than one to the <tt>targ_selvecs</tt> line. For instance, you can support coff object files if you add <tt>i386coff_vec</tt> to the <tt>targ_selvecs</tt> list. All the <tt>x86_64</tt> entries in the file are wrapped in <tt>#ifdef BFD64</tt>, because when targmatch.sed processes this file to turn it into targmatch.h, it will add <tt>#ifdef BFD64</tt> so that the relevant code is only compiled when targeting a 64-bit platform.
 
=== gas/configure.tgt ===
This file tells the GNU assembler what type of output to generate for each target. It automatically matches the i686 part of your target and generates the correct output for that. We just need to tell it what type of object file to generate for myos. In the section starting '<tt>Assign object format ... case ${generic_target} in</tt>' you need to add a line like:
 
<syntaxhighlight lang="bash">
  i386-*-myos*)    fmt=elf ;;
</syntaxhighlight>
 
You should use '<tt>i386</tt>' in this line even if you are targeting x86_64. This is the only file where you should do it. It is basically because the variable 'generic_target' is not your canonical target name, but rather a variable generated further up in the configure.tgt file, and it sets the first part to <tt>i386</tt> for any <tt>i[3-7]86</tt> or <tt>x86_64</tt>.
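
To make the mapping concrete, here is a simplified sketch of what happens to the CPU part of the triplet. It is not the literal code from <tt>gas/configure.tgt</tt>, and the function name <tt>generic_cpu</tt> is hypothetical.

<syntaxhighlight lang="bash">
# Simplified sketch: every 32-bit and 64-bit x86 CPU name collapses to i386.
generic_cpu() {
    case "$1" in
        i[3-7]86 | x86_64) echo i386 ;;
        *)                 echo "$1" ;;
    esac
}

for cpu in i686 x86_64; do
    echo "${cpu}-pc-myos is matched as $(generic_cpu "$cpu")-pc-myos"
done
</syntaxhighlight>

Both targets therefore reach the <tt>i386-*-myos*</tt> case, which is why a single line is enough.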
 
Note: this will use the 'generic' emulation. One side-effect is that gas will interpret slash ('/') as a comment, not as a division operator. This will break any code like "<tt>movl $(ADDRESS/PAGE_SIZE), %eax</tt>". Using "<tt>fmt=elf em=gnu ;;</tt>" or "<tt>fmt=elf em=linux ;;</tt>" will disable slash as a comment character.
 
=== ld/configure.tgt ===
 
This file tells the GNU linker what 'emulation' to use for each target. An emulation is basically a combination of linker script and executable file format. We are going to define our own emulation called <tt>elf_i386_myos</tt>. We need to add an entry to the case statement here after '<tt>Please try to keep this table more or less in alphabetic order ... case "${targ}" in</tt>':
 
<syntaxhighlight lang="bash">
i[3-7]86-*-myos*)
    targ_emul=elf_i386_myos
    # (... snip ...)
x86_64-*-myos*)
    targ_emul=elf_x86_64_myos
    targ_extra_emuls="elf_i386_myos elf_x86_64 elf_i386"
    ;;
</syntaxhighlight>
 
* '''elf_i386_myos''' is a 32-bit target for your OS.
Now we need to actually define our emulation. There is a generic file called <tt>ld/genscripts.sh</tt> which creates the required linker scripts for our target (you need more than one, depending on shared object usage and the like: I have 13 for a single target). It uses a linker script template (from the ld/scripttempl directory) to do this, and it creates the actual emulation C file from an emulation template (from the ld/emultempl directory). These templates are customised by running a script in the ld/emulparams directory which sets various variables. You are welcome to define your own emulation and linker templates, but I find the ELF ones adequate, given that they can be customised by simply adding a file to the emulparams directory. This is what we are going to do now. The content of the file could be something like:
 
<syntaxhighlight lang="bash">
source_sh ${srcdir}/emulparams/elf_i386.sh
TEXT_START_ADDR=0x08000000
GENERATE_SHLIB_SCRIPT=yes
GENERATE_PIE_SCRIPT=yes
</syntaxhighlight>
 
This script is included by <tt>ld/genscripts.sh</tt> to customize its behavior through shell variables. We include the base <tt>elf_i386.sh</tt> script as it sets reasonable defaults. Finally, we override the variables whose defaults we disagree with.
There are a large number of variables that can be set here to customize your toolchain. Read the documentation and look at existing emulations for further information. These are some of the variables that can be set:
 
* '''GENERATE_SHLIB_SCRIPT'''=''yes|no'' Whether to generate a linker script for shared libraries. We enable this as you might want it later. The base 32-bit elf script defaults this to disabled for some reason.
* '''GENERATE_PIE_SCRIPT'''=''yes|no'' Whether to generate a linker script for position independent executables. We enable this as you might want it later. The base 32-bit elf script defaults this to disabled for some reason.
* '''SCRIPT_NAME'''=''name'' Controls which ld/scripttempl/''name''.sc script generates our linker scripts.
* '''TEMPLATE_NAME'''=''name'' Controls which ld/emultempl/''name''.em script generates our bfd emulation C implementation.
This file is just like the above <tt>ld/emulparams/elf_i386_myos.sh</tt> but for x86_64.
 
<syntaxhighlight lang="bash">
source_sh ${srcdir}/emulparams/elf_x86_64.sh
</syntaxhighlight>
 
There is no reason to set <tt>GENERATE_SHLIB_SCRIPT</tt> and <tt>GENERATE_PIE_SCRIPT</tt> here as the x86_64 base script enables them by default.
 
=== ld/Makefile.am ===
 
We now just need to tell make how to produce the emulation C file for our specific emulation. Putting the '<tt>targ_emul=elf_i386_myos</tt>' line into <tt>ld/configure.tgt</tt> above implies that your host linker will try to link your target ld executable with an object file called <tt>eelf_i386_myos.o</tt>. There is a default rule to generate this from <tt>eelf_i386_myos.c</tt>, so we just need to tell it how to make this <tt>eelf_i386_myos.c</tt> file. As stated above, we let the genscripts.sh file do the hard work. You need to add <tt>eelf_i386_myos.c</tt> to the <tt>ALL_EMULATION_SOURCES</tt> list; you also need to add <tt>eelf_x86_64_myos.c</tt> to the <tt>ALL_64_EMULATION_SOURCES</tt> list if applicable.
 
'''Note''': You ''must'' run <tt>automake</tt> in the <tt>ld</tt> directory after you modify <tt>Makefile.am</tt> to regenerate <tt>Makefile.in</tt>.
=== config.sub ===
 
Make the same modification as for config.sub in Binutils.
 
=== gcc/config.gcc ===
This file defines what needs to be built for each particular target and what to include in the final executable. There are two main sections: one which defines generic options for your operating system, and one which defines options specific to your operating system on each individual machine type.
 
For the first part, find the '<tt>case ${target} in</tt>' line just after '<tt># Common parts for widely ported systems</tt>' (around line 688) and add something like:
 
<syntaxhighlight lang="bash">
*-*-myos*)
  gas=yes
  gnu_ld=yes
  default_use_cxa_atexit=yes
  use_gcc_stdint=provide
  ;;
</syntaxhighlight>
 
* '''gas=yes''' Our operating system uses the GNU assembler by default.
* '''gnu_ld=yes''' Our operating system uses the GNU linker by default.
* '''default_use_cxa_atexit=yes''' We will provide ''__cxa_atexit'' (you will need to provide this in your standard library).
* '''use_gcc_stdint=provide''' This instructs GCC to provide you with a ''stdint.h'' appropriate for your target. Change <tt>provide</tt> to <tt>wrap</tt> if you have your own ''stdint.h'', to make GCC wrap yours.
 
The second section we need to add to is the architecture-specific one. Find the '<tt>case ${target} in</tt>' line just before '<tt>tm_file="${tm_file} dbxelf.h elfos.h newlib-stdint.h"</tt>' (around line 1094) and add something like:
 
<syntaxhighlight lang="bash">
i[34567]86-*-myos*)
    tm_file="${tm_file} i386/unix.h i386/att.h elfos.h glibc-stdint.h i386/i386elf.h myos.h"
    ;;
x86_64-*-myos*)
    tm_file="${tm_file} i386/unix.h i386/att.h elfos.h glibc-stdint.h i386/i386elf.h i386/x86-64.h myos.h"
    ;;
</syntaxhighlight>
 
This defines which target configuration header files get used. You can make <tt>i386/myos32.h</tt> and <tt>i386/myos64.h</tt> files if desired.
This header allows you to customize your toolchain using preprocessor macros. The relevant parts of GCC will include this header (as controlled by <tt>gcc/config.gcc</tt>) and modify the behavior according to your customizations.
 
You can explore <tt>gcc/defaults.h</tt> for a full list of things you can modify, and more importantly, the assumptions GCC will make about various aspects of your target. For instance, if <tt>PID_TYPE</tt> is not defined in <tt>myos.h</tt>, then GCC will default it to <tt>int</tt>, which can be problematic if that's not what your <tt>pid_t</tt> is defined as.

<syntaxhighlight lang="C">
/* Useful if you wish to make target-specific GCC changes. */
#undef TARGET_MYOS
#define TARGET_MYOS 1

/* Default arguments you want when running your
   i686-myos-gcc/x86_64-myos-gcc toolchain */
#undef LIB_SPEC
#define LIB_SPEC "-lc" /* link against C standard library */
                       /* modify this based on your needs */

/* Files that are linked before user code.
   The %s tells GCC to look for these files in the library directory. */
#undef STARTFILE_SPEC
#define STARTFILE_SPEC "crt0.o%s crti.o%s crtbegin.o%s"

/* Files that are linked after user code. */
#undef ENDFILE_SPEC
#define ENDFILE_SPEC "crtend.o%s crtn.o%s"

/* Additional predefined macros. */
/* (... snip ...) */
    builtin_assert ("system=posix");   \
  } while(0);
</syntaxhighlight>
 
=== libstdc++-v3/crossconfig.m4 ===
This file describes how the libstdc++ configure file will examine your operating system and adjust the provided features of libstdc++ accordingly. Add a case similar to:
 
<syntaxhighlight lang="bash">
  *-myos*)
    GLIBCXX_CHECK_COMPILER_FEATURES
    # (... snip ...)
    GLIBCXX_CHECK_STDLIB_SUPPORT
    ;;
</syntaxhighlight>
 
'''Note''': You need to run <tt>autoconf</tt> in the <tt>libstdc++-v3</tt> directory.
=== libgcc/config.host ===
 
Find the '<tt>case ${host} in</tt>' just prior to '<tt>extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"</tt>' (around line 368) and add the cases:
 
<syntaxhighlight lang="bash">
i[34567]86-*-myos*)
    extra_parts="$extra_parts crti.o crtbegin.o crtend.o crtn.o"
    tmake_file="$tmake_file i386/t-crtstuff t-crtstuff-pic t-libgcc-pic"
    ;;
x86_64-*-myos*)
    extra_parts="$extra_parts crti.o crtbegin.o crtend.o crtn.o"
    tmake_file="$tmake_file i386/t-crtstuff t-crtstuff-pic t-libgcc-pic"
    ;;
</syntaxhighlight>
 
=== fixincludes/mkfixinc.sh ===
You should disable fixincludes for your operating system. Find the case statement and add a pattern for your operating system. For instance:
 
<syntaxhighlight lang="bash">
# Check for special fix rules for particular targets
case $machine in
    *-*-myos* | \
    i?86-*-cygwin* | \
    # (... snip ...)
        (echo "#! /bin/sh" ; echo "exit 0" ) > ${target}
        ;;
</syntaxhighlight>
 
A number of operating systems (especially older and obscure ones) provide troublesome system headers that fail to strictly comply with various standards. The GCC developers consider it their job to fix these headers. GCC will look into your system root, apply a bunch of patterns to detect headers it doesn't like, then copy each such header into a private GCC system directory (that overrides your standard system directory) and attempt to fix it. Sometimes fixincludes even breaks working headers (some people refer to it as breakincludes).
 
This is rather inconvenient as your libc will likely happen to trigger these patterns (and false positives often happen). Any time you change your system headers, you have to rebuild your compiler so the fixed versions get updated. The first time you encounter this, it will show up as a system header that does nothing different even though you edit it.
=== Changing the Default Include Directory ===
 
If you wish to change the default include directory from <tt>/usr/include</tt>, you can override the <tt>native_system_header_dir</tt> variable in <tt>gcc/config.gcc</tt> in the case for your OS.
 
=== Changing the Default Library Directory ===
 
If you wish to change the default library directory from <tt>/usr/lib</tt>, you can change it to <tt>/lib</tt> by adding the following block of code to the case just below the declaration of <tt>NATIVE_LIB_DIRS</tt> in <tt>binutils/ld/configure.tgt</tt> (around line 1056).
 
<syntaxhighlight lang="bash">
*-*-myos*)
NATIVE_LIB_DIRS='/lib /local/lib'
;;
</syntaxhighlight>
 
=== Start Files Directory ===

You can modify the directory in which GCC looks for crt0.o, crti.o and crtn.o. The path to that directory is stored in <tt>STANDARD_STARTFILE_PREFIX</tt>. For instance, if you change the library directory to <tt>/lib</tt> in Binutils and want GCC to match, you can add the following to <tt>gcc/config/myos.h</tt>:

<syntaxhighlight lang="C">
#undef STANDARD_STARTFILE_PREFIX
#define STANDARD_STARTFILE_PREFIX "/x86_64-myos/lib/"
</syntaxhighlight>
 
Note that the trailing slash is important as the raw crt*.o names are appended without first adding a slash.
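
The effect of a missing slash can be illustrated with plain string concatenation. This is only a sketch of the naming behaviour described above, not GCC's actual lookup code, and <tt>join_startfile</tt> is a hypothetical helper.

<syntaxhighlight lang="bash">
# Hypothetical helper: append a file name to a prefix without adding a slash.
join_startfile() {
    printf '%s%s\n' "$1" "$2"
}

join_startfile "/x86_64-myos/lib/" crt0.o   # with the trailing slash
join_startfile "/x86_64-myos/lib"  crt0.o   # without it: the path is mangled
</syntaxhighlight>

The first call prints <tt>/x86_64-myos/lib/crt0.o</tt>; the second prints <tt>/x86_64-myos/libcrt0.o</tt>, which names a file that does not exist.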
 
 
=== Dynamic linking ===
If you want to support dynamic linking with your toolchain, you need to...
* Pass the <tt>"--enable-shared"</tt> flag to the <tt>./configure</tt> scripts of both binutils and gcc
* Set <tt>LINK_SPEC</tt> in your "gcc/config/myos.h" so the gcc driver will pass the correct flags to the linker:
 
<syntaxhighlight lang="C">
#undef LINK_SPEC
#define LINK_SPEC "%{shared:-shared} %{static:-static} %{!shared: %{!static: %{rdynamic:-export-dynamic}}}"
</syntaxhighlight>
 
=== Linker Options ===
 
You can modify the arguments passed to the linker using <tt>LINK_SPEC</tt>. This can be used to force 4KB alignment of sections on 64 bit systems, as ld defaults to 2MB alignment.
 
<syntaxhighlight lang="C">
/* Tell ld to force 4KB pages*/
#undef LINK_SPEC
#define LINK_SPEC "-z max-page-size=4096"
</syntaxhighlight>
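
One way to check that the setting took effect is to inspect the program headers of a freshly linked program. This sketch assumes the toolchain is already built and installed with the <tt>x86_64-myos-</tt> prefix, and that <tt>hello.c</tt> is any small test program of yours.

<syntaxhighlight lang="bash">
# Build a test program with the new cross-compiler.
x86_64-myos-gcc hello.c -o hello

# List the program headers; the Align column of the LOAD segments
# should show 0x1000 rather than 0x200000.
x86_64-myos-readelf -lW hello
</syntaxhighlight>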
 
== Selecting a C Library ==
* [[GCC]]
 
[[Category:Toolchains]]
[[Category:Tutorials]]