Porting Newlib: Difference between revisions

From OSDev.wiki
m (Made the Jeff Johnston quote a bit easier to read by removing the excessive newlines.)
mNo edit summary
 
(32 intermediate revisions by 14 users not shown)
Line 1: Line 1:
{{Rating|2}}
{{Rating|3}}
Newlib is a C library intended for use on embedded systems, available under a free software license. It is known for being simple to port to new operating systems. Allegedly, its coding practices are sometimes questionable. This tutorial follows [[OS Specific Toolchain]] and completes it using newlib rather than another [[C Library]] such as [[Creating a C Library|your own]].
Everyone wants to do it eventually. Porting newlib is one of the easiest ways to get a simple C library into your operating system without an excessive amount of effort. As an added bonus, once complete you can port the toolchain (GCC/binutils) to your OS - and who wouldn't want to do that?

Porting newlib is one of the easiest ways to get a simple C library into your operating system without an excessive amount of effort. As an added bonus, once complete you can port the toolchain (GCC/binutils) to your OS - and who wouldn't want to do that?

''This article was written with x86 in mind. It has been extended to armv8 through tips and notes in the troubleshooting section.''


== Introduction ==
Line 6: Line 10:
I decided that after an incredibly difficult week of trying to get newlib ported to my own OS that I would write a tutorial that outlines the requirements for porting newlib and how to actually do it. I'm assuming you can already load binaries from somewhere and that these binaries are compiled C code. I also assume you have a syscall interface setup already. Why wait? Let's get cracking!


== Preparation ==


Download newlib source (I'm using 2.5.0) from [ftp://sources.redhat.com/pub/newlib/index.html this ftp server].
== Getting Ready ==


=== Download source code of Automake and Autoconf ===
First of all you need to support a set of 17 system calls that act as 'glue' between newlib and your OS. These calls are the typical "_exit", "_open", "read/write", "execve" (et al). The text below is taken straight from the [http://sourceware.org/newlib/libc.html#Syscalls Red Hat newlib C library] documentation:


Acquire Automake (v1.11) and Autoconf (v2.65) from here:
<pre>
[https://ftp.gnu.org/gnu/automake/automake-1.11.tar.gz]
_exit
[https://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.gz]
Exit a program without cleaning up files.
If your system doesn't provide this, it is best to avoid linking with subroutines that
require it (exit, system).


''Note'': The newlib source is organized using "Cygnus style," which is unsupported in Automake versions 1.12 and beyond.
close
Therefore, to be able to build newlib, you need a version less than or equal to 1.11.
Close a file.


Untar both of the archives:
Minimal implementation:
<syntaxhighlight lang="bash">
tar xvf automake-1.11.tar.gz
tar xvf autoconf-2.65.tar.gz
</syntaxhighlight>


Create a destination folder:
int close(int file){
<syntaxhighlight lang="bash">
return -1;
mkdir ~/bin
}
</syntaxhighlight>


Create a build folder:
environ
<syntaxhighlight lang="bash">
A pointer to a list of environment variables and their values.
mkdir build
For a minimal environment, this empty list is adequate:
cd build
</syntaxhighlight>


Configure automake first:
char *__env[1] = { 0 };
<syntaxhighlight lang="bash">
char **environ = __env;
../automake-1.11/configure --prefix="$HOME/bin"
</syntaxhighlight>


Make and install
execve
Transfer control to a new process.


<syntaxhighlight lang="bash">
Minimal implementation (for a system without processes):
make && make install
</syntaxhighlight>
Now let's configure autoconf:
<syntaxhighlight lang="bash">
../autoconf-2.65/configure --prefix="$HOME/bin"
</syntaxhighlight>
Then make and install:
<syntaxhighlight lang="bash">
make && make install
</syntaxhighlight>


You should now have the proper binaries in ~/bin!
#include <errno.h>
#undef errno
extern int errno;
int execve(char *name, char **argv, char **env){
errno=ENOMEM;
return -1;
}


To add these binaries to your path temporarily
fork
<syntaxhighlight lang="bash">
Create a new process.
export PATH=~/bin:$PATH
</syntaxhighlight>


== System Calls ==
Minimal implementation (for a system without processes):


First of all you need to support a set of 17 system calls that act as 'glue' between newlib and your OS. These calls are the typical "_exit", "open", "read/write", "execve" (et al).
#include <errno.h>
See the [http://sourceware.org/newlib/libc.html#Syscalls Red Hat newlib C library] documentation for an overview of necessary calls.
#undef errno
extern int errno;
int fork() {
errno=EAGAIN;
return -1;
}


Newlib uses a very specific hierarchy of syscalls, many of which can be supplied by more than one file.
fstat
This can quickly lead to symbol redefinition or symbol missing errors when linking with the library.
Status of an open file.
The normal way that newlib expects you to define syscalls, which you may see elsewhere is to define the underscored symbols (e.g. _open instead of open).
In this case, newlib will call the underscored versions using wrappers defined in newlib/libc/syscalls/.
Our different (simplified) approach is to define the syscalls directly. No wrappers.
To do this the newlib_cflags variable must be set to "" in configure.host (default for some platforms, like x86), which will prevent the wrappers from being compiled.


Implementing the syscalls is usually quite trivial, my kernel exposes all the system calls on interrupt 0x80 (128d) so I just had to put a bit of inline assembly into each stub to do what I needed it to do.
For consistency with other minimal implementations in these examples, all files are regarded as
It's up to you how to implement them in relation to your kernel.
character special devices.


== Porting Newlib ==
The `sys/stat.h' header file is distributed in the `include' subdirectory for this C library.


=== config.sub ===
#include <sys/stat.h>
int fstat(int file, struct stat *st) {
st->st_mode = S_IFCHR;
return 0;
}


Same as for binutils in [[OS Specific Toolchain]].
getpid
Process-ID;


=== newlib/configure.host ===
This is sometimes used to generate strings unlikely to conflict with other processes.
Minimal implementation, for a system without processes:


Tell newlib which system-specific directory to use for our particular target. In the section starting 'Get the source directories to use for the host ... case "${host}" in', add a section:
int getpid() {
return 1;
}


<syntaxhighlight lang="bash">
isatty
i[3-7]86-*-myos*)
Query whether output stream is a terminal.
sys_dir=myos
;;
</syntaxhighlight>


configure.host contains two switch clauses, make sure that your variables are not overwritten later!
For consistency with the other minimal implementations, which only support output to stdout,
For example, for aarch64 platforms it sets the syscall_dir variable after us, breaking the library.
this minimal implementation is suggested:


=== newlib/libc/sys/configure.in ===
int isatty(int file){
return 1;
}


Tell the newlib build system that it also needs to configure our myos-specific host directory. In the <tt>case ${sys_dir} in</tt> list, simply add
kill
Send a signal.


<syntaxhighlight lang="bash">
Minimal implementation:
myos) AC_CONFIG_SUBDIRS(myos) ;;
</syntaxhighlight>


'''Note:''' After this, you need to run <tt>autoconf (precisely version 2.64)</tt> in the libc/sys directory.
#include <errno.h>
#undef errno
extern int errno;
int kill(int pid, int sig){
errno=EINVAL;
return(-1);
}


=== newlib/libc/sys/myos ===
link
Establish a new name for an existing file.


This is a directory that we need to create where we put our OS-specific extensions to newlib. We need to create a minimum of 4 files. You can easily add more files to this directory to define your own os-specific library functions, if you want them to be included in libc.a (and so linked in to every application by default).
Minimal implementation:


=== newlib/libc/sys/myos/crt0.c ===
#include <errno.h>
#undef errno
extern int errno;
int link(char *old, char *new){
errno=EMLINK;
return -1;
}


This file creates crt0.o, which is included in every application. It should define the symbol _start, and then call the main() function, possibly after setting up process-space segment selectors and pushing argc and argv onto the stack. A simple implementation is:
lseek
Set position in a file.


<syntaxhighlight lang="C">
Minimal implementation:
#include <fcntl.h>


extern void exit(int code);
int lseek(int file, int ptr, int dir){
extern int main ();
return 0;
}


void _start() {
open
int ex = main();
Open a file. Minimal implementation:
exit(ex);
}
</syntaxhighlight>


'''Note:''' add in argc and argv support based on how you handle them in your OS
int open(const char *name, int flags, int mode){
return -1;
}


=== newlib/libc/sys/myos/syscalls.c ===
read
Read from a file. Minimal implementation:


This file should contain the implementations for each glue function newlib requires.
int read(int file, char *ptr, int len){
return 0;
}


<syntaxhighlight lang="C">
sbrk
/* note these headers are all provided by newlib - you don't need to provide them */
Increase program data space.
#include <sys/stat.h>
As malloc and related functions depend on this, it is useful to have a working implementation.
#include <sys/types.h>
#include <sys/fcntl.h>
The following suffices for a standalone system;
#include <sys/times.h>
it exploits the symbol end automatically defined by the GNU linker.
#include <sys/errno.h>
#include <sys/time.h>
#include <stdio.h>


void _exit();
caddr_t sbrk(int incr){
int close(int file);
extern char end; /* Defined by the linker */
char **environ; /* pointer to array of char * strings that define the current environment variables */
static char *heap_end;
char *prev_heap_end;
int execve(char *name, char **argv, char **env);
int fork();
int fstat(int file, struct stat *st);
if (heap_end == 0) {
int getpid();
heap_end = &end;
int isatty(int file);
}
int kill(int pid, int sig);
prev_heap_end = heap_end;
int link(char *old, char *new);
if (heap_end + incr > stack_ptr)
int lseek(int file, int ptr, int dir);
{
int open(const char *name, int flags, ...);
_write (1, "Heap and stack collision\n", 25);
int read(int file, char *ptr, int len);
abort ();
caddr_t sbrk(int incr);
}
int stat(const char *file, struct stat *st);
clock_t times(struct tms *buf);
int unlink(char *name);
int wait(int *status);
int write(int file, char *ptr, int len);
int gettimeofday(struct timeval *p, struct timezone *z);
</syntaxhighlight>


'''Note''': You may split this up into multiple files, just don't forget to link against all of them in Makefile.am.
heap_end += incr;
return (caddr_t) prev_heap_end;
}


=== newlib/libc/sys/myos/configure.in ===
stat
Status of a file (by name).
Minimal implementation:


Configure script for our system directory.
int stat(const char *file, struct stat *st) {
st->st_mode = S_IFCHR;
return 0;
}


<syntaxhighlight lang="bash">
times
AC_PREREQ(2.59)
Timing information for current process.
AC_INIT([newlib], [NEWLIB_VERSION])
AC_CONFIG_SRCDIR([crt0.c])
AC_CONFIG_AUX_DIR(../../../..)
NEWLIB_CONFIGURE(../../..)
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
</syntaxhighlight>


=== newlib/libc/sys/myos/Makefile.am ===
Minimal implementation:
clock_t times(struct tms *buf){
return -1;
}


A Makefile template for this directory:
unlink
Remove a file's directory entry.


<syntaxhighlight lang="make">
Minimal implementation:
AUTOMAKE_OPTIONS = cygnus
INCLUDES = $(NEWLIB_CFLAGS) $(CROSS_CFLAGS) $(TARGET_CFLAGS)
AM_CCASFLAGS = $(INCLUDES)


noinst_LIBRARIES = lib.a
#include <errno.h>
#undef errno
extern int errno;
int unlink(char *name){
errno=ENOENT;
return -1;
}


if MAY_SUPPLY_SYSCALLS
wait
extra_objs = syscalls.o # add more object files here if you split up
Wait for a child process.
else # syscalls.c into multiple files in the previous step
extra_objs =
endif


lib_a_SOURCES =
Minimal implementation:
lib_a_LIBADD = $(extra_objs)
EXTRA_lib_a_SOURCES = syscalls.c crt0.c # add more source files here if you split up
lib_a_DEPENDENCIES = $(extra_objs) # syscalls.c into multiple files
lib_a_CCASFLAGS = $(AM_CCASFLAGS)
lib_a_CFLAGS = $(AM_CFLAGS)


if MAY_SUPPLY_SYSCALLS
#include <errno.h>
all: crt0.o
#undef errno
endif
extern int errno;
int wait(int *status) {
errno=ECHILD;
return -1;
}


ACLOCAL_AMFLAGS = -I ../../..
write
CONFIG_STATUS_DEPENDENCIES = $(newlib_basedir)/configure.host
Write a character to a file.
</syntaxhighlight>


'''Note''': After this, you need to run <tt>autoconf</tt> in the newlib/libc/sys/ directory, and <tt>autoreconf</tt> in the newlib/libc/sys/myos directory.
`libc' subroutines will use this system routine for output to all files,
including stdout---so if you need to generate any output, for example to a serial port for
debugging, you should make your minimal write capable of doing this.


=== Signal handling ===
The following minimal implementation is an incomplete example; it relies on a writechar
subroutine (not shown; typically, you must write this in assembler from examples provided
by your hardware manufacturer) to actually perform the output.


Newlib has two different mechanisms for dealing with UNIX signals (see the man pages for signal()/raise()). In the first, it provides its own emulation, where it maintains a table of signal handlers in a per-process manner. If you use this method, then you will only be able to respond to signals sent from within the current process. In order to support it, all you need to do is make sure your crt0 calls '_init_signal' before it calls main, which sets up the signal handler table.
int write(int file, char *ptr, int len){
int todo;
for (todo = 0; todo < len; todo++) {
writechar(*ptr++);
}
return len;
}
</pre>


Alternatively, you can provide your own implementation. To do this, you need to define your own version of signal() in syscalls.c. A typical implementation would register the handler somewhere in kernel space, so that issuing a signal from another process causes the corresponding function to be called in the receiving process (this will also require some nifty stack-playing in the receiving process, as you are basically interrupting the program flow in the middle). You then need to provide a kill() function in syscalls.c which actually sends signals to another process. Newlib will still define a raise() function for you, but it is just a stub which calls kill() with the current process id. To switch newlib to this mode, you need to #define the SIGNAL_PROVIDED macro when compiling. A simple way to do this is to add the line:
According to the documentation, you should also redefine errno as an 'extern char':
<source lang="C">
#include <errno.h>
#undef errno
extern int errno;
</source>


<syntaxhighlight lang="bash">
Re-entrant versions of these are a bit harder and are outlined in the documentation.
newlib_cflags="${newlib_cflags} -DSIGNAL_PROVIDED"
</syntaxhighlight>


to your host's entry in <tt>configure.host</tt>. It would probably also make sense to provide sigaction(), and provide signal() as a wrapper for it. Note that [http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigaction.html the Open Group's] definition of sigaction states that 1) sigaction supersedes signal, and 2) an application designed shouldn't use both to manipulate the same signal.
== Using the Kernel's System Calls ==

My kernel exposes all the system calls on interrupt 0x80 (128d) so I just had to put a bit of inline assembly into each stub to do what I needed it to do. It's up to you how to implement them in relation to your kernel.


== Compiling ==


You can build newlib in this manner:
Guess what... you're almost there!
Newlib is very picky about the compiler, and you probably haven't built your own i686-myos-gcc toolchain yet, meaning that configure will not be happy when you set the target to i686-myos. So use this hack to get it to work (it worked fine for me).

'''Note:''' there must be a better way than this.

<syntaxhighlight lang="bash">
# newlib setup
CURRDIR=$(pwd)

# make symlinks (a bad hack) to make newlib work
cd ~/cross/bin/ # this is where the bootstrapped generic cross compiler toolchain (i686-elf-xxx) is installed in,
# change this based on your development environment.
ln -s i686-elf-ar i686-myos-ar
ln -s i686-elf-as i686-myos-as
ln -s i686-elf-gcc i686-myos-gcc
ln -s i686-elf-gcc i686-myos-cc
ln -s i686-elf-ranlib i686-myos-ranlib

# return
cd $CURRDIR
</syntaxhighlight>


Then run the following commands to build newlib
Download newlib source (I'm using 1.15.0) from [ftp://sources.redhat.com/pub/newlib/index.html this ftp server]. I'm using Cygwin with an ELF cross-compiler (--target=i586-elf), so I put the source into the C:\cygwin\usr\src folder.


<syntaxhighlight lang="bash">
Once the source was downloaded, I loaded up Cygwin. All that needs to be done here is to build newlib:
<source lang="bash">
cd /usr/src
mkdir build-newlib
cd build-newlib
../newlib-''<version>''/configure --prefix=''<location to put libarary>'' --target=i586-elf
../newlib-x.y.z/configure --prefix=/usr --target=i686-myos
make all install
make all
make DESTDIR=${SYSROOT} install
</source>
</syntaxhighlight>


'''Note:''' SYSROOT is where all your OS-specific toolchains will be installed in. It will look like a miniature version of the Linux filesystem, but have your OS-specific toolchains in; I am using ~/myos as my SYSROOT directory.
Simple! However, if you try linking a previously written program with newlib you'll get undefined references everywhere. Why? You haven't yet put your 'glue' into the newlib yet.


'''Note:''' By default, newlib is configured to not support %lld/u format specifiers in printf()/scanf() (i.e. it assumes %lld to mean the same as %ld). In order to override this, should it matter, one must add --enable-newlib-io-long-long to the configure invocation
According to Jeff Johnston on the newlib mailing list:


For some reason, the newer versions of newlib (at least for me) didn't put the libraries in a location where other utilities like binutils could find them.
''So, you get the majority of the C library from newlib and the rest (syscalls) is usually in libgloss. Using an ld script makes life easy for the end-user as all they have to do is specify -Txxxx.ld. Inside the ld script you can specify all the libraries needed, where the entry point is, etc.... The libgloss library is a separate library and you name it whatever you want. The ld script handles all of this internally and the user doesn't need to know just what libraries there are out there.''
So here's another hack to fix this:


<syntaxhighlight lang="bash">
Basically, in the libgloss directory you will find 17 files, all of which are the syscalls we wrote earlier. Put your code into these files, configure, build and then you'll have another library. Typically you would rename this library to something like "youros.a" (in my case "mattise.a") and tell all programmers to link with the linker script you write. An added bonus of this is that every executable for your OS uses a link script you've created.
cp -ar $SYSROOT/usr/i686-myos/* $SYSROOT/usr/
</syntaxhighlight>


After building all of this, your freshly built libc will be installed in your SYSROOT directory! Now you can progress to building your own [[OS Specific Toolchain]].
== CRT0 ==


'''Important Note:''' I found that for newlib to properly work, you have to link against libc, libg, libm, and libnosys - hence when porting gcc, in
Finally you must write something called the CRT0, basically the startup code for the executable. My CRT0 is really simple (GAS syntax):
<source lang="asm">
.global _start
.extern main
_start:


<syntaxhighlight lang="C">
## here you might want to get the argc/argv pairs somehow and then push
#define LIB_SPEC ...
## them onto the stack...
</syntaxhighlight>


in gcc/config/myos.h,
# call the user's function
call main


make sure you put
# call the 'kill me' syscall to end
movl $0,%eax
int $0x80


<syntaxhighlight lang="C">
# loop in case we haven't yet rescheduled
#define LIB_SPEC "-lc -lg -lm -lnosys"
lp:
</syntaxhighlight>
hlt
jmp lp
</source>


at the bare minimum.
All programs must link this as the '''first''' object ([[Linker_Scripts|linker script]] to the rescue).


I highly recommend rebuilding the library with your [[OS Specific Toolchain]] after you are done porting one. (don't forget to remove the symlinks, too.)
== Testing ==


== Conclusion ==
I suggest writing a simple test program, the following will suffice:
<source lang="C">
int main()
{
*((ushort_t*) 0xB8000) = 0x7020; // put a gray block in the top left corner
return 0;
}
</source>


Well, you've done it. You've ported newlib to your OS! With this you can start creating user mode programs with ease! You may now also add in new functions to newlib, such as dlopen(), dlclose(), dlsym(), and dlerror() for dynamic linking support. Your operating system has a bright road ahead! You can now port the toolchain and run binutils and GCC on your own OS. Almost self-hosting, how do you feel?
Put a little bit of code into your system call interface to print the function number that has been called and look for any possible calls.


Good luck!
One note about the above... I've already said this but I assume that you can load in an executable binary. Without being able to load an external binary a port of newlib becomes useless - unless you decide to link it into your kernel and use its features (but you still need to write the glue layer, and with different function names from the glue functions).


Last Updated by '''0fb1d8''' for compatibility with newer versions of newlib and the [[OS Specific Toolchain]] tutorial.
== Conclusion ==


'''Note:''' I used a lot of hacks in this article, if you find a better way to do something, please contribute to the page. Thank you.
Well, you've done it. You've ported newlib to your OS! This is a really simple approach to the port but for those who just want to get it done and are happy to put together any special cases later (see the newlib/libc/sys/linux for an example of a special case) then there are heaps of resources out there that can help you out.


== Troubleshooting ==
There is one obvious advantage to porting newlib: you can now port the toolchain and run binutils and GCC on your own OS. Almost self-hosting, how do you feel?


=== Autotools ===
Good luck!
* Whenever you modify configure.ac/in files, or Makefile.in files that are not auto-generated, you must run the appropriate autoconf/automake/autoreconf command.
* autoreconf is a tool that automatically calls the autotools as required to process the present working directory. Autoreconf will use the versions of the autotools in your path, so make sure to prepend(!) your custom build of the autotools to the PATH variable. There are also supposedly environment variables that can be set.
* *.ac or *.in files are modified by pattern substitution. Lone spaces or tabs WILL cause issues later, since Makefiles are whitespace sensitive and the whitespace is never removed.

=== Library Implementation ===
* In some situations, the <code>-DMISSING_SYSCALL_NAMES</code> flag must be set in <code>newlib_cflags</code> so that newlib's internal functions call your syscalls by their plain names, i.e. sbrk() instead of _sbrk(), matching what you defined in your syscalls.c. Otherwise, the symbol _sbrk will be reported as missing when linking.
* The crt0.o object is overridden by the one generated by <code>libgloss/aarch64</code>. To use your crt file, make sure to override the existing one in your sysroot with the version from the <code>$NEWLIB_BUILD_DIR/$TARGET/newlib/</code> folder after installing newlib.

=== Build System Tips ===
* When porting aarch64 (and perhaps other platforms), not having *-elf at the end of your target string can lead to a circular dependency in one of the Makefiles, causing the build to fail on copying the .spec files. Make sure to add a match statement in the <code>case "${target}" in</code> case in `libgloss/aarch64/configure.in`.
* The build system was rewritten in later versions of NEWLIB. The instructions regarding the automake files will no longer apply. For the most part, it just means leaving out those parts, but be prepared to do your own detective work!

=== General Tips ===
* Copy-pasting may introduce aforementioned missing newlines or spurious spaces.
* Don't move your build artifacts. This may break dependencies.
* If you are stuck, try taking a look at other newlib ports, such as Jin Xue's lyos (https://github.com/Jimx-/lyos).


== See Also ==


=== Articles ===
* [[Boomstick]] a script to build a complete GCC toolchain, including newlib, for your OS. Just fill in the stubs! (or don't :)
* [[GCC Cross-Compiler]]
* [[OS Specific Toolchain]]


[[Category:Porting]]
=== Threads ===
[[Category:C]]

[[Category:Standard Libraries]]
[[Category:Tutorials]]

Latest revision as of 09:38, 9 June 2024

Difficulty level

Advanced

Newlib is a C library intended for use on embedded systems, available under a free software license. It is known for being simple to port to new operating systems. Allegedly, its coding practices are sometimes questionable. This tutorial follows OS Specific Toolchain and completes it using newlib rather than another C Library such as your own.

Porting newlib is one of the easiest ways to get a simple C library into your operating system without an excessive amount of effort. As an added bonus, once complete you can port the toolchain (GCC/binutils) to your OS - and who wouldn't want to do that?

This article was written with x86 in mind. It has been extended to armv8 through tips and notes in the troubleshooting section.

Introduction

After an incredibly difficult week of trying to get newlib ported to my own OS, I decided to write a tutorial that outlines the requirements for porting newlib and how to actually do it. I'm assuming you can already load binaries from somewhere and that these binaries are compiled C code. I also assume you already have a syscall interface set up. Why wait? Let's get cracking!

Preparation

Download the newlib source (I'm using 2.5.0) from the newlib FTP server (ftp://sources.redhat.com/pub/newlib/index.html).

Download source code of Automake and Autoconf

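Acquire Automake (v1.11) and Autoconf (v2.65) from here: https://ftp.gnu.org/gnu/automake/automake-1.11.tar.gz and https://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.gz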

Note: The newlib source is organized using "Cygnus style," which is unsupported in Automake versions 1.12 and beyond. Therefore, to be able to build newlib, you need a version less than or equal to 1.11.

Untar both of the archives:

tar xvf automake-1.11.tar.gz
tar xvf autoconf-2.65.tar.gz

Create a destination folder:

mkdir ~/bin

Create a build folder:

mkdir build
cd build

Configure automake first:

../automake-1.11/configure --prefix="$HOME/bin"

Make and install

make && make install

Now let's configure autoconf:

../autoconf-2.65/configure --prefix="$HOME/bin"

Then make and install:

make && make install

You should now have the proper binaries in ~/bin!

To add these binaries to your path temporarily

export PATH=~/bin:$PATH

System Calls

First of all, you need to support a set of 17 system calls that act as 'glue' between newlib and your OS. These are the typical calls: "_exit", "open", "read/write", "execve", and so on. See the Red Hat newlib C library documentation (http://sourceware.org/newlib/libc.html#Syscalls) for an overview of the necessary calls.

Newlib uses a very specific hierarchy of syscalls, many of which can be supplied by more than one file. This can quickly lead to symbol redefinition or missing symbol errors when linking with the library. The normal way newlib expects you to define syscalls, which you may see elsewhere, is to define the underscored symbols (e.g. _open instead of open). In that case, newlib calls the underscored versions through wrappers defined in newlib/libc/syscalls/. Our different (simplified) approach is to define the syscalls directly, with no wrappers. To do this, the newlib_cflags variable must be set to "" in configure.host (the default for some platforms, like x86), which prevents the wrappers from being compiled.

Implementing the syscalls is usually quite trivial. My kernel exposes all the system calls on interrupt 0x80 (128 decimal), so I just had to put a bit of inline assembly into each stub to do what I needed it to do. It's up to you how to implement them in relation to your kernel.
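As a rough illustration of what such a stub can look like, here is a minimal sketch of a write glue function. Everything prefixed with myos_ is hypothetical: the syscall number, the myos_trap() helper, and the convention that the kernel returns a negative errno value on failure are assumptions, not something newlib or this article prescribes. The trap itself is replaced by a stand-in so the sketch compiles on a host; in a real port its body would be the inline assembly mentioned above, and the function would be named write.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical syscall number; the real value comes from your kernel's ABI. */
#define MYOS_SYS_WRITE 4

/*
 * Stand-in for the trap into the kernel. On x86 the body would be a few
 * lines of inline assembly that load the registers and raise interrupt 0x80.
 * Here it always fails with -ENOSYS so the sketch runs on a host system.
 */
static intptr_t myos_trap(intptr_t num, intptr_t a, intptr_t b, intptr_t c)
{
    (void)num; (void)a; (void)b; (void)c;
    return -ENOSYS;
}

/* Glue function: marshal the arguments, trap, translate the result. */
int myos_write(int file, char *ptr, int len)
{
    intptr_t ret = myos_trap(MYOS_SYS_WRITE, file, (intptr_t)ptr, len);

    if (ret < 0) {
        errno = (int)-ret;  /* assumed convention: kernel returns -errno */
        return -1;
    }
    return (int)ret;
}
```

Keeping the trap in one helper means every other glue function reduces to the same three steps, and only the helper has to change if the kernel's calling convention changes.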

Porting Newlib

config.sub

Same as for binutils in OS Specific Toolchain.

newlib/configure.host

Tell newlib which system-specific directory to use for our particular target. In the section starting 'Get the source directories to use for the host ... case "${host}" in', add a section:

i[3-7]86-*-myos*)
    sys_dir=myos
    ;;

configure.host contains two case statements, so make sure that your variables are not overwritten later! For example, for aarch64 platforms it sets the syscall_dir variable after our entry, breaking the library.

newlib/libc/sys/configure.in

Tell the newlib build system that it also needs to configure our myos-specific host directory. In the case ${sys_dir} in list, simply add

  myos) AC_CONFIG_SUBDIRS(myos) ;;

Note: After this, you need to run autoconf (precisely version 2.64) in the libc/sys directory.

newlib/libc/sys/myos

This is a directory that we need to create where we put our OS-specific extensions to newlib. We need to create a minimum of 4 files. You can easily add more files to this directory to define your own os-specific library functions, if you want them to be included in libc.a (and so linked in to every application by default).

newlib/libc/sys/myos/crt0.c

This file creates crt0.o, which is included in every application. It should define the symbol _start, and then call the main() function, possibly after setting up process-space segment selectors and pushing argc and argv onto the stack. A simple implementation is:

#include <fcntl.h>

extern void exit(int code);
extern int main ();

void _start() {
    int ex = main();
    exit(ex);
}

Note: add in argc and argv support based on how you handle them in your OS
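How the arguments reach the process depends entirely on your loader. As one possible sketch, assume (hypothetically) that the kernel hands the new process a block of back-to-back NUL-terminated strings plus its length. The helper below, whose name myos_build_argv is made up for this example, turns such a block into an argv array that _start could pass on to main().

```c
#include <stddef.h>

/*
 * Split a block of back-to-back NUL-terminated strings into an argv array.
 * Returns argc. argv must have room for max_args + 1 pointers; the extra
 * slot holds the terminating NULL that main() expects at argv[argc].
 * Assumes the block itself ends with a NUL byte.
 */
int myos_build_argv(char *block, size_t len, char **argv, int max_args)
{
    int argc = 0;
    size_t i = 0;

    while (i < len && argc < max_args) {
        argv[argc++] = &block[i];
        while (i < len && block[i] != '\0')
            i++;            /* scan to the end of this argument */
        i++;                /* step over the terminator */
    }
    argv[argc] = NULL;
    return argc;
}
```

With something like this in place, _start would build argv first and then call exit(main(argc, argv)) instead of calling main() without arguments.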

newlib/libc/sys/myos/syscalls.c

This file should contain the implementations for each glue function newlib requires.

/* note these headers are all provided by newlib - you don't need to provide them */
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/times.h>
#include <sys/errno.h>
#include <sys/time.h>
#include <stdio.h>

void _exit();
int close(int file);
char **environ; /* pointer to array of char * strings that define the current environment variables */
int execve(char *name, char **argv, char **env);
int fork();
int fstat(int file, struct stat *st);
int getpid();
int isatty(int file);
int kill(int pid, int sig);
int link(char *old, char *new);
int lseek(int file, int ptr, int dir);
int open(const char *name, int flags, ...);
int read(int file, char *ptr, int len);
caddr_t sbrk(int incr);
int stat(const char *file, struct stat *st);
clock_t times(struct tms *buf);
int unlink(char *name);
int wait(int *status);
int write(int file, char *ptr, int len);
int gettimeofday(struct timeval *p, struct timezone *z);

Note: You may split this up into multiple files; just don't forget to add all of them to Makefile.am.

newlib/libc/sys/myos/configure.in

Configure script for our system directory.

AC_PREREQ(2.59)
AC_INIT([newlib], [NEWLIB_VERSION])
AC_CONFIG_SRCDIR([crt0.c])
AC_CONFIG_AUX_DIR(../../../..)
NEWLIB_CONFIGURE(../../..)
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

newlib/libc/sys/myos/Makefile.am

A Makefile template for this directory:

AUTOMAKE_OPTIONS = cygnus
INCLUDES = $(NEWLIB_CFLAGS) $(CROSS_CFLAGS) $(TARGET_CFLAGS)
AM_CCASFLAGS = $(INCLUDES)

noinst_LIBRARIES = lib.a

if MAY_SUPPLY_SYSCALLS
extra_objs = syscalls.o # add more object files here if you split up
else                    # syscalls.c into multiple files in the previous step
extra_objs =
endif

lib_a_SOURCES =
lib_a_LIBADD = $(extra_objs)
EXTRA_lib_a_SOURCES = syscalls.c crt0.c # add more source files here if you split up
lib_a_DEPENDENCIES = $(extra_objs)      # syscalls.c into multiple files
lib_a_CCASFLAGS = $(AM_CCASFLAGS)
lib_a_CFLAGS = $(AM_CFLAGS)

if MAY_SUPPLY_SYSCALLS
all: crt0.o
endif

ACLOCAL_AMFLAGS = -I ../../..
CONFIG_STATUS_DEPENDENCIES = $(newlib_basedir)/configure.host

Note: After this, you need to run autoconf in the newlib/libc/sys/ directory, and autoreconf in the newlib/libc/sys/myos directory.

Signal handling

Newlib has two different mechanisms for dealing with UNIX signals (see the man pages for signal()/raise()). In the first, it provides its own emulation, where it maintains a table of signal handlers in a per-process manner. If you use this method, then you will only be able to respond to signals sent from within the current process. In order to support it, all you need to do is make sure your crt0 calls '_init_signal' before it calls main, which sets up the signal handler table.
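To make the idea of a per-process handler table concrete, here is a simplified sketch of what such an emulation amounts to. This is not newlib's actual code; emu_signal, emu_raise, EMU_NSIG and the demo handler emu_record are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define EMU_NSIG 32                       /* hypothetical table size */

typedef void (*emu_handler_t)(int);

/* One slot per signal number; NULL means "no handler installed". */
static emu_handler_t emu_table[EMU_NSIG];

/* Install a handler and return the previous one, like signal(). */
emu_handler_t emu_signal(int sig, emu_handler_t handler)
{
    emu_handler_t old;

    if (sig < 0 || sig >= EMU_NSIG)
        return NULL;
    old = emu_table[sig];
    emu_table[sig] = handler;
    return old;
}

/* "Send" a signal to ourselves: a table lookup and a function call. */
int emu_raise(int sig)
{
    emu_handler_t h;

    if (sig < 0 || sig >= EMU_NSIG)
        return -1;
    h = emu_table[sig];
    if (h == NULL)
        return -1;      /* a real implementation applies the default action */
    h(sig);
    return 0;
}

/* Demo handler: remembers the last signal it saw. */
static int emu_last_sig;
static void emu_record(int sig) { emu_last_sig = sig; }
```

Because the table lives in the process's own memory, nothing outside the process can reach it, which is why this mode only handles signals raised from within the current process.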

Alternatively, you can provide your own implementation. To do this, you need to define your own version of signal() in syscalls.c. A typical implementation would register the handler somewhere in kernel space, so that issuing a signal from another process causes the corresponding function to be called in the receiving process (this will also require some nifty stack-playing in the receiving process, as you are basically interrupting the program flow in the middle). You then need to provide a kill() function in syscalls.c which actually sends signals to another process. Newlib will still define a raise() function for you, but it is just a stub which calls kill() with the current process id. To switch newlib to this mode, you need to #define the SIGNAL_PROVIDED macro when compiling. A simple way to do this is to add the line:

newlib_cflags="${newlib_cflags} -DSIGNAL_PROVIDED"

to your host's entry in configure.host. It would probably also make sense to provide sigaction(), and provide signal() as a wrapper for it. Note that the Open Group's definition of sigaction states that 1) sigaction supersedes signal, and 2) an application shouldn't use both to manipulate the same signal.
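A signal() wrapper over sigaction() could look roughly like the sketch below. It assumes your OS already implements sigaction() with POSIX semantics; the wrapper is named myos_signal here (a hypothetical name) so that it does not clash with an existing libc, and on_sig is just a demo handler.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask the headers for sigaction() */
#endif
#include <signal.h>
#include <stddef.h>

typedef void (*sig_handler_t)(int);

/* signal()-style front end: install handler, return the previous one. */
sig_handler_t myos_signal(int sig, sig_handler_t handler)
{
    struct sigaction sa, old;

    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);     /* block no extra signals in the handler */
    sa.sa_flags = 0;

    if (sigaction(sig, &sa, &old) < 0)
        return SIG_ERR;
    return old.sa_handler;
}

/* Demo handler: records which signal arrived. */
static volatile sig_atomic_t got_sig;
static void on_sig(int sig) { got_sig = sig; }
```

Whether to set flags such as SA_RESTART in the wrapper is a design choice; the sketch leaves sa_flags at 0 to keep the behavior easy to reason about.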

Compiling

Newlib is very picky about the compiler, and you probably haven't built your own i686-myos-gcc toolchain yet, so configure will not be happy when you set the target to i686-myos. The following hack works around this (it worked fine for me).

Note: There must be a better way than this.

# newlib setup
CURRDIR=$(pwd)

# make symlinks (a bad hack) to make newlib work
cd ~/cross/bin/ # this is where the bootstrapped generic cross compiler toolchain (i686-elf-xxx) is installed;
                # change this based on your development environment.
ln -s i686-elf-ar i686-myos-ar
ln -s i686-elf-as i686-myos-as
ln -s i686-elf-gcc i686-myos-gcc
ln -s i686-elf-gcc i686-myos-cc
ln -s i686-elf-ranlib i686-myos-ranlib

# return
cd "$CURRDIR"

Then run the following commands to build newlib:

mkdir build-newlib
cd build-newlib
../newlib-x.y.z/configure --prefix=/usr --target=i686-myos
make all
make DESTDIR=${SYSROOT} install

Note: SYSROOT is the directory where your OS-specific toolchain and libraries get installed. It will look like a miniature version of the Linux filesystem, but containing your OS-specific files; I am using ~/myos as my SYSROOT directory.

Note: By default, newlib is configured to not support the %lld/%llu format specifiers in printf()/scanf() (i.e. it assumes %lld to mean the same as %ld). To override this, should it matter, add --enable-newlib-io-long-long to the configure invocation.
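A quick way to check which behavior your build ended up with is a small test program on the target. The sketch below formats a value that does not fit in 32 bits and compares the result; the function name printf_handles_long_long is hypothetical. The check is only meaningful on targets where long is 32 bits wide (as on i686), since otherwise %ld and %lld agree anyway.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the printf family formats %lld as a full long long. */
int printf_handles_long_long(void)
{
    char buf[32];
    long long big = 5000000000LL;   /* larger than any 32-bit value */

    snprintf(buf, sizeof buf, "%lld", big);
    return strcmp(buf, "5000000000") == 0;
}
```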

For some reason, the newer versions of newlib (at least for me) didn't put the libraries in a location where other utilities like binutils could find them. So here's another hack to fix this:

cp -ar $SYSROOT/usr/i686-myos/* $SYSROOT/usr/

After building all of this, your freshly built libc will be installed in your SYSROOT directory! Now you can progress to building your own OS Specific Toolchain.

Important Note: I found that for newlib to work properly, you have to link against libc, libg, libm, and libnosys. When porting GCC, make sure that the LIB_SPEC definition in gcc/config/myos.h contains, at the bare minimum:

#define LIB_SPEC "-lc -lg -lm -lnosys"

I highly recommend rebuilding the library with your OS Specific Toolchain once you have finished porting it. (Don't forget to remove the symlinks, too.)

Conclusion

Well, you've done it. You've ported newlib to your OS! With this you can start creating user mode programs with ease! You may now also add in new functions to newlib, such as dlopen(), dlclose(), dlsym(), and dlerror() for dynamic linking support. Your operating system has a bright road ahead! You can now port the toolchain and run binutils and GCC on your own OS. Almost self-hosting, how do you feel?

Good luck!

Last Updated by 0fb1d8 for compatibility with newer versions of newlib and the OS Specific Toolchain tutorial.

Note: I used a lot of hacks in this article. If you find a better way to do something, please contribute to the page. Thank you.

Troubleshooting

Autotools

  • Whenever you modify configure.ac/in files, or Makefile.in files that are not auto-generated, you must run the appropriate autoconf/automake/autoreconf command.
  • autoreconf is a tool that automatically calls the autotools as required to process the present working directory. It will use the versions of autotools in your path, so make sure to prepend(!) your custom build of autotools to the PATH variable. There are also supposedly environment variables that can be set.
  • *.ac or *.in files are modified by pattern substitution. Lone spaces or tabs WILL cause issues later, since Makefiles are whitespace sensitive and the whitespace is never removed.

Library Implementation

  • In some situations, the -DMISSING_SYSCALL_NAMES flag must be set in `newlib_cflags` so that newlib calls your syscalls by their plain names, i.e. sbrk() rather than _sbrk(), matching what you defined in syscalls.c. Otherwise, the symbol _sbrk will be reported as missing when linking.
  • The crt0.o object is overridden by the one generated by libgloss/aarch64. To use your crt file, make sure to override the existing one in your sysroot with the version from the $NEWLIB_BUILD_DIR/$TARGET/newlib/ folder after installing newlib.

Build System Tips

  • When porting to aarch64 (and perhaps other platforms), not having *-elf at the end of your target string can lead to a circular dependency in one of the Makefiles, causing the build to fail when copying the .spec files. Make sure to add a pattern matching your target to the case "${target}" in statement in `libgloss/aarch64/configure.in`.
  • The build system was rewritten in later versions of newlib, so the instructions regarding the automake files no longer apply there. For the most part, it just means leaving out those parts, but be prepared to do your own detective work!

General Tips

  • Copy-pasting may introduce aforementioned missing newlines or spurious spaces.
  • Don't move your build artifacts. This may break dependencies.
  • If you are stuck, try taking a look at other newlib ports, such as Jin Xue's lyos (https://github.com/Jimx-/lyos).

See Also

Articles