Creating a C Library: Difference between revisions

== Building ==
 
Now that you have an OS-specific freestanding toolchain that can build your kernel, you face a bootstrapping problem. To build an OS-specific toolchain that supports a hosted environment, you need the headers of your standard C library installed. To build your libc, you need a compiler that supports a hosted environment (building a user-space libc as if it were in a freestanding environment is a logical mistake). The solution is to create a make target that installs your C library and kernel headers into your sysroot's include directory without needing a compiler. Then you can build your cross-compiler with --with-sysroot="$SYSROOT" (and without the --without-headers option), and you should get a cross-compiler that offers a freestanding environment for your kernel and a hosted environment for your user-space. Note that [[libgcc]] is <em>not</em> optional: gcc <em>will</em> emit calls to it whenever it thinks that is a good idea. libgcc may in turn depend on a few things from your C library. If you get errors while building libgcc, add the missing declarations to your header files. You will also get undefined-symbol errors if GCC deems it appropriate to call a libgcc function that needs a libc symbol you have not provided, so it is a good idea to implement what libgcc needs early on.
 
Then you can simply build your C library source files using:
 
<syntaxhighlight lang="bash">
x86_64-myos-gcc -c strfoo.c -o strfoo.o
x86_64-myos-as x86_64/crt0.s -o x86_64/crt0.o
x86_64-myos-ar rcs libc.a strfoo.o x86_64/crt0.o
</syntaxhighlight>
 
Note that unlike the kernel, you don't need to add extra special options that disable standard include directories and libraries. After all, this is the C library meant for user-space. You don't need to add -I options to the compiler if you have already installed your libc and kernel headers into your sysroot. Otherwise, simply add the appropriate -I options to the above commands. It may be useful to share one libc code base between user-space and the kernel, but don't make the mistake of thinking that a user-space libc and a kernel libc are the same thing. First of all, many user-space facilities don't make sense (or even work) in the kernel. Secondly, a kernel-usable libc must be built differently from the user-space libc because it is <em>freestanding</em>. A kernel libc must be compiled with all the special options that the kernel binary itself is built with. For instance, on x86_64 don't forget to add -mno-red-zone to both your kernel and its libc, or interrupts may corrupt the stack.
:''See also: [[Calling Global Constructors]]''
 
The first and most important thing to implement in a C library is the _start function, to which control is passed from your program loader. Its task is to initialize and run the process. Normally this is done by initializing the C library (if needed), then calling the global constructors, and finally calling exit(main(argc, argv)). You can change the name of the default program entry point by adding ENTRY=_my_start_name in your OS-specific binutils emulparams script (binutils/ld/emulparams). You can change which start files are used by modifying gcc/gcc/config/myos.h in your OS-specific GCC. The macros STARTFILE_SPEC and ENDFILE_SPEC define the object files to use, written in the GCC spec language. See gcc/gcc/config/gnu-user.h for examples of how to use them. If you decide to use the conventional (GNU-like) names and semantics for these initialization files, the following information applies:
 
=== crt0.o ===
 
Below is a simple implementation of crt0.s for x86_64. It assumes that the program loader has put *argv and *envp on the stack, and that %rdi contains argc, %rsi contains argv, %rdx contains envc, and %rcx contains envp:
<syntaxhighlight lang="asm">
.section .text

/* ... */
	movl %eax, %edi
	call exit
.size _start, . - _start
</syntaxhighlight>
 
This implementation is careful to set up the end of the stack frame linked list. If you compile your files without optimization or with -fno-omit-frame-pointer, each function adds itself to this linked list. This is very useful if you wish to add call tracing support. In that case, you'll need to know when you have reached the end, which is why we add an explicit zero in the above code.
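
A minimal sketch of how a call tracer could walk that list is shown below. The names <tt>frame_record</tt> and <tt>walk_frames</tt> are hypothetical, and the record layout (saved frame pointer followed by the return address) is an assumption that matches x86_64 code built with frame pointers. The walk stops at the zero link described above.

<syntaxhighlight lang="c">
#include <stddef.h>

/* Assumed layout of one frame record on x86_64 when frame pointers are kept:
   the caller's saved %rbp, followed by the return address. */
struct frame_record {
    const struct frame_record *caller; /* NULL (zero) marks the end of the list */
    void *return_address;
};

/* Follow the linked list starting at `frame` and store up to `max` return
   addresses in `out`. Returns the number of addresses stored. */
size_t walk_frames(const struct frame_record *frame, void **out, size_t max)
{
    size_t count = 0;

    while (frame != NULL && count < max) {
        out[count++] = frame->return_address;
        frame = frame->caller;
    }
    return count;
}
</syntaxhighlight>

With GCC, the starting record for the current function could be obtained from <tt>__builtin_frame_address(0)</tt>. Whether the records really have this layout depends on your ABI and compiler flags, so treat the structure as something to verify for your target.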
 
Hence a crti.s implementation will simply be (x86_64):
<syntaxhighlight lang="asm">
.section .init
.global _init
/* ... */
	movq %rsp, %rbp
	/* gcc will nicely put the contents of crtbegin.o's .fini section here. */
</syntaxhighlight>
 
and a simple implementation of crtn.s will be (x86_64):
 
<syntaxhighlight lang="asm">
.section .init
	/* gcc will nicely put the contents of crtend.o's .init section here. */
	popq %rbp
	ret

.section .fini
/* ... */
	popq %rbp
	ret
</syntaxhighlight>
 
Finally, you simply need to assemble your crt0.o, crti.o, and crtn.o files and install them in your system library directory. Your _start function is now able to set up the standard library, call the global constructors, and call exit(main(argc, argv)). Don't forget to call your _fini function in your exit function, or the global destructors won't be run, leading to subtle bugs.
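
The ordering described above can be sketched as a small hosted simulation. Every name here (<tt>sim_start</tt>, <tt>sim_exit</tt>, <tt>sim_init</tt>, and so on) is a hypothetical stand-in rather than real crt or libc code. The simulated exit returns the status so the sequence can be observed, whereas a real exit never returns.

<syntaxhighlight lang="c">
#include <stddef.h>

/* Hosted simulation only: records the order in which the startup and
   shutdown steps run. All names are hypothetical stand-ins. */
#define TRACE_MAX 8
const char *trace[TRACE_MAX];
size_t trace_len;

static void record(const char *step)
{
    if (trace_len < TRACE_MAX)
        trace[trace_len++] = step;
}

static void sim_initialize_standard_library(void) { record("libc init"); }
static void sim_init(void) { record("_init"); } /* global constructors */
static void sim_fini(void) { record("_fini"); } /* global destructors */

/* Stand-in for the program's main; returns argc as its exit status. */
static int sim_main(int argc, char **argv)
{
    (void)argv;
    record("main");
    return argc;
}

/* Stand-in for exit(): run the destructors, then terminate the process.
   A real exit() would end with a process-termination system call. */
static int sim_exit(int status)
{
    sim_fini();
    record("terminate");
    return status;
}

/* Stand-in for _start: set up libc, run constructors, run main, exit. */
int sim_start(int argc, char **argv)
{
    sim_initialize_standard_library();
    sim_init();
    return sim_exit(sim_main(argc, argv));
}
</syntaxhighlight>

If the call to <tt>sim_fini</tt> were left out of <tt>sim_exit</tt>, the trace would jump straight from main to termination, which is the destructor bug the paragraph above warns about.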
 
== Implementing ==
Most of this will be up to you. A good place to start is the mem* and str* functions, as they are mostly independent of syscalls and other library routines, which makes unit testing a breeze. clang and gcc, being nice compilers, both provide a set of [https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html builtin] functions that can be used to skip implementing some parts. This approach gives you working versions of these functions, with the tradeoff that it is specific to clang and gcc. While this is probably fine for a kernel, your mileage may vary in userland.
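
As a rough illustration of how small these first functions can be, here is a sketch of three of them. They carry a hypothetical <tt>my_</tt> prefix so that they can be compiled and unit tested on a host system without clashing with the host's libc; in your own libc they would use the standard names. These are straightforward byte-at-a-time versions, not tuned implementations.

<syntaxhighlight lang="c">
#include <stddef.h>

/* Length of a NUL-terminated string, not counting the terminator. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Copy n bytes from src to dest; the regions must not overlap. */
void *my_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dest;
}

/* Compare n bytes as unsigned char; returns <0, 0 or >0. */
int my_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *x = a;
    const unsigned char *y = b;

    for (size_t i = 0; i < n; i++) {
        if (x[i] != y[i])
            return x[i] < y[i] ? -1 : 1;
    }
    return 0;
}
</syntaxhighlight>

Because none of these touch the kernel, they can be tested with plain assertions on your development machine before the rest of the libc exists.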
 
== Standards ==
 
Fortunately, you can find plenty of the relevant standards online. For instance, you can look at:
* The C Standard (2011), latest draft is available [http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf here (PDF)].
* [http://pubs.opengroup.org/onlinepubs/9699919799/ POSIX (2008)].
* GNU libc documentation.
C library headers conventionally use include guards to prevent multiple declarations. It may be wise to use the same format as GNU's libc as some programs (and GCC fixincludes) may erroneously rely on these macros to detect whether a given header has been included. GNU libc's headers usually follow this simple scheme:
 
<syntaxhighlight lang="c">
#ifndef _STDIO_H
#define _STDIO_H 1

/* ... */

#endif
</syntaxhighlight>
 
== Repeated Declarations ==
 
The occasional poor design of the C standard library has led to the requirement that some declarations be repeated in multiple header files. Worse, if a header needs the FILE declaration from stdio.h, it is not allowed to include stdio.h because of the namespace pollution. This leads to a multiple-maintenance problem and can be dangerous if you update the declaration in one place and forget to do so in another. If you have a look at GNU libc's stdio.h header, you will discover a maze of #ifdefs and __need_FILE preprocessor magic that allows you to write #define __need_FILE followed by #include <stdio.h> and then get <em>only</em> the declaration of FILE. While this works, it is very ugly and grows horribly in complexity.
 
Another solution is to put the FILE declaration in its own header <decl_FILE.h> with its own include guard. This is much cleaner and simpler to understand, but causes extra work for the preprocessor, which can get comparatively expensive if this is done for a lot of declarations. An alternative solution is to pre-preprocess the header files by writing a simple compiler that automatically finds these <decl_FILE.h> inclusions and inserts the contents into the header file. This trades some build-time complexity for maintainability and runtime performance.
 
== C++ Compatibility ==
Traditionally C++ programs can use the C headers, even though such headers are written in C and have C linkage. Unless you specify otherwise, GCC will assume that header files found in your system include directory are written in C and have C linkage. This is done by GCC automatically inserting extern "C" { ... } around all included system headers. However, this may lead to strange linking-failures if you try to use C++ headers from the system include directory. The key solution is to make your C headers explicitly compatible with C++ and tell GCC that you understand C++.
 
'''For GCC versions < 9''', telling GCC that your headers understand C++ is done by having the following in your OS-specific gcc/gcc/config/myos.h:
 
<syntaxhighlight lang="c">
/* Don't assume anything about the header files. */
#undef NO_IMPLICIT_EXTERN_C
#define NO_IMPLICIT_EXTERN_C 1
</syntaxhighlight>
 
'''For GCC versions >= 9''', [https://patchwork.ozlabs.org/patch/934478/ this patch] changed the behaviour; they have made <tt>NO_IMPLICIT_EXTERN_C</tt> a so-called "poisoned identifier", so compiling your OS-specific toolchain will fail. If you read the linked patch above, you will realise that they decided to invert the behaviour, such that system headers are assumed to understand C++ by default. So, in theory, you do not need those two lines above at all, and it should Just Work (tm).
 
If, instead, your system headers indeed ''do not'' support C++ (why?), you can <tt>#define SYSTEM_IMPLICIT_EXTERN_C 1</tt>, as they have done for AIX.
 
Adding support for C++ in your C header files (such as stdio.h) is very simple. Previously GCC automatically added extern "C" around all headers if compiling C++, but now we'll need to do it ourselves. The key feature is that we don't need to do this on C++-only headers, meaning that C++ headers will now actually work, instead of GCC assuming they have C-linkage. For instance, you can change your stdio.h to be of this form:
 
<syntaxhighlight lang="c">
#ifndef _STDIO_H
#define _STDIO_H 1

/* ... */

#endif
</syntaxhighlight>
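
As a self-contained illustration of the same idea, the sketch below wraps the declarations of a made-up header in an explicit extern "C" block that is only visible to C++ compilers. The guard name <tt>_MYOS_EXAMPLE_H</tt> and the function <tt>myos_add</tt> are hypothetical and exist only for this example. The definition at the end would normally live in a separate .c file inside your libc.

<syntaxhighlight lang="c">
#ifndef _MYOS_EXAMPLE_H
#define _MYOS_EXAMPLE_H 1

/* Only a C++ compiler defines __cplusplus, so a C compiler never sees
   the extern "C" syntax, which it would reject. */
#ifdef __cplusplus
extern "C" {
#endif

/* Declarations that must have C linkage go here (hypothetical function). */
int myos_add(int a, int b);

#ifdef __cplusplus
} /* extern "C" */
#endif

#endif /* _MYOS_EXAMPLE_H */

/* Definition, compiled as C as part of the library. */
int myos_add(int a, int b)
{
    return a + b;
}
</syntaxhighlight>

A C++ program that includes such a header refers to the unmangled symbol, so it links against the C object file without GCC having to guess the linkage.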
 
[[Category:C]]
[[Category:Standard Libraries]]
[[Category:Porting]]
[[Category:Tutorials]]