Porting Newlib
Difficulty level: Advanced
Newlib is a C library intended for use on embedded systems, available under a free software license. It is known for being simple to port to new operating systems. Allegedly, its coding practices are sometimes questionable. This tutorial follows OS Specific Toolchain and completes it using newlib rather than another C library such as your own.
Porting newlib is one of the easiest ways to get a simple C library into your operating system without an excessive amount of effort. As an added bonus, once complete you can port the toolchain (GCC/binutils) to your OS - and who wouldn't want to do that?
This article was written with x86 in mind. It has been extended to armv8 through tips and notes in the troubleshooting section.
Introduction
After an incredibly difficult week of trying to get newlib ported to my own OS, I decided to write a tutorial that outlines the requirements for porting newlib and how to actually do it. I'm assuming you can already load binaries from somewhere and that these binaries are compiled C code. I also assume you have a syscall interface set up already. Why wait? Let's get cracking!
Preparation
Download newlib source (I'm using 2.5.0) from this ftp server.
Download source code of Automake and Autoconf
Acquire Automake (v1.11) and Autoconf (v2.65) from here: [1] [2]
Note: The newlib source is organized using "Cygnus style," which is unsupported in Automake versions 1.12 and beyond. Therefore, to be able to build newlib, you need a version less than or equal to 1.11.
Untar both of the archives:
tar xvf automake-1.11.tar.gz
tar xvf autoconf-2.65.tar.gz
Create a destination folder:
mkdir ~/bin
Create a build folder:
mkdir build
cd build
Configure automake first:
../automake-1.11/configure --prefix="$HOME/bin"
Make and install
make && make install
Now let's configure autoconf:
../autoconf-2.65/configure --prefix="$HOME/bin"
Then make and install:
make && make install
You should now have the proper binaries in ~/bin/bin (the install prefix is ~/bin, and programs go into its bin subdirectory)!
To add these binaries to your path temporarily:
export PATH=~/bin/bin:$PATH
System Calls
First of all you need to support a set of 17 system calls that act as 'glue' between newlib and your OS. These calls are the typical "_exit", "open", "read/write", "execve" (et al). See the Red Hat newlib C library documentation for an overview of necessary calls.
Newlib uses a very specific hierarchy of syscalls, many of which can be supplied by more than one file. This can quickly lead to symbol redefinition or missing symbol errors when linking with the library. The normal way that newlib expects you to define syscalls, which you may see elsewhere, is to define the underscored symbols (e.g. _open instead of open). In that case, newlib calls the underscored versions through wrappers defined in newlib/libc/syscalls/. Our different (simplified) approach is to define the syscalls directly, with no wrappers. To do this, the newlib_cflags variable must be set to "" in configure.host (the default for some platforms, like x86), which prevents the wrappers from being compiled.
Implementing the syscalls is usually quite trivial. My kernel exposes all the system calls on interrupt 0x80 (128 decimal), so I just had to put a bit of inline assembly into each stub to do what I needed it to do. How you implement them in relation to your kernel is up to you.
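As a rough illustration of such a stub, here is a minimal sketch of a write() glue function that traps through interrupt 0x80. Everything specific in it is an assumption: the syscall number SYS_WRITE, the register convention (number in eax, arguments in ebx/ecx/edx, result in eax), the __myos__ macro used to detect the target compiler, and the name myos_write, which is only there so the sketch does not clash with a host libc. When not compiling for the target, a stand-in records the request instead of trapping, so the marshalling can be checked on any machine.

```c
#include <stddef.h>

/* Hypothetical syscall number; use whatever your kernel assigns. */
#define SYS_WRITE 4

#if defined(__myos__) && defined(__i386__)
/* Real target: trap into the kernel through interrupt 0x80.
   The register convention here is an assumption about your kernel's ABI. */
static long syscall3(long num, long a, long b, long c)
{
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(num), "b"(a), "c"(b), "d"(c)
                      : "memory");
    return ret;
}
#else
/* Host stand-in: remember the last request instead of trapping. */
static long last_call[4];

static long syscall3(long num, long a, long b, long c)
{
    last_call[0] = num;
    last_call[1] = a;
    last_call[2] = b;
    last_call[3] = c;
    return c; /* pretend every byte was written */
}
#endif

/* In syscalls.c this function would simply be named write(). */
int myos_write(int file, char *ptr, int len)
{
    return (int)syscall3(SYS_WRITE, file, (long)ptr, len);
}
```

The other stubs would follow the same pattern with their own numbers and argument counts. If your kernel reports failures as negative return values, the stub is also the natural place to translate them into errno.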
Porting Newlib
config.sub
Same as for binutils in OS Specific Toolchain.
newlib/configure.host
Tell newlib which system-specific directory to use for our particular target. In the section starting 'Get the source directories to use for the host ... case "${host}" in', add a section:
i[3-7]86-*-myos*)
sys_dir=myos
;;
configure.host contains two case statements, so make sure that your variables are not overwritten later! For example, for aarch64 platforms a later clause sets the syscall_dir variable after our entry, breaking the library.
newlib/libc/sys/configure.in
Tell the newlib build system that it also needs to configure our myos-specific host directory. In the case ${sys_dir} in list, simply add:
myos) AC_CONFIG_SUBDIRS(myos) ;;
Note: After this, you need to run autoconf (precisely version 2.64) in the libc/sys directory.
newlib/libc/sys/myos
This is a directory that we need to create, where we put our OS-specific extensions to newlib. We need to create a minimum of 4 files. You can easily add more files to this directory to define your own OS-specific library functions, if you want them to be included in libc.a (and so linked into every application by default).
newlib/libc/sys/myos/crt0.c
This file creates crt0.o, which is included in every application. It should define the symbol _start, and then call the main() function, possibly after setting up process-space segment selectors and pushing argc and argv onto the stack. A simple implementation is:
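#include <fcntl.h>

extern void exit(int code);
extern int main();

/* Entry point: the linker expects the symbol _start. */
void _start(void) {
    /* Run the program, then hand its return value to exit(). */
    int ex = main();
    exit(ex);
}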
Note: add in argc and argv support based on how you handle them in your OS
newlib/libc/sys/myos/syscalls.c
This file should contain the implementations for each glue function newlib requires.
/* note these headers are all provided by newlib - you don't need to provide them */
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/times.h>
#include <sys/errno.h>
#include <sys/time.h>
#include <stdio.h>
void _exit();
int close(int file);
char **environ; /* pointer to array of char * strings that define the current environment variables */
int execve(char *name, char **argv, char **env);
int fork();
int fstat(int file, struct stat *st);
int getpid();
int isatty(int file);
int kill(int pid, int sig);
int link(char *old, char *new);
int lseek(int file, int ptr, int dir);
int open(const char *name, int flags, ...);
int read(int file, char *ptr, int len);
caddr_t sbrk(int incr);
int stat(const char *file, struct stat *st);
clock_t times(struct tms *buf);
int unlink(char *name);
int wait(int *status);
int write(int file, char *ptr, int len);
int gettimeofday(struct timeval *p, struct timezone *z);
Note: You may split this up into multiple files. Just don't forget to list all of them in Makefile.am.
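To give an idea of what the bodies behind those prototypes can look like, here is a minimal sketch of two of them. It assumes a fixed-size static arena as the heap (HEAP_SIZE is an arbitrary value chosen for illustration); a real port would more likely ask the kernel to grow the process's data segment. The fork stub follows the "not supported yet" pattern from the newlib documentation's minimal implementations. The myos_ prefixes are hypothetical and only there so the sketch can be compiled next to a host libc; in syscalls.c the functions would be named sbrk and fork.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: a fixed 64 KiB arena stands in for the process heap. */
#define HEAP_SIZE (64 * 1024)

static unsigned char heap[HEAP_SIZE];
static size_t heap_used; /* offset of the current program break */

/* Move the break by incr bytes and return the previous break,
   or (void *)-1 with errno set to ENOMEM if the request does not fit. */
void *myos_sbrk(ptrdiff_t incr)
{
    unsigned char *prev = heap + heap_used;

    if (incr < 0) {
        if ((size_t)-incr > heap_used) {
            errno = ENOMEM;
            return (void *)-1;
        }
        heap_used -= (size_t)-incr;
    } else {
        if ((size_t)incr > HEAP_SIZE - heap_used) {
            errno = ENOMEM;
            return (void *)-1;
        }
        heap_used += (size_t)incr;
    }
    return prev;
}

/* A call the OS cannot service yet: fail cleanly instead of crashing. */
int myos_fork(void)
{
    errno = EAGAIN;
    return -1;
}
```

Since newlib's malloc() obtains its memory through sbrk(), this is usually one of the first stubs worth getting right.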
newlib/libc/sys/myos/configure.in
Configure script for our system directory.
AC_PREREQ(2.59)
AC_INIT([newlib], [NEWLIB_VERSION])
AC_CONFIG_SRCDIR([crt0.c])
AC_CONFIG_AUX_DIR(../../../..)
NEWLIB_CONFIGURE(../../..)
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
newlib/libc/sys/myos/Makefile.am
A Makefile template for this directory:
AUTOMAKE_OPTIONS = cygnus
INCLUDES = $(NEWLIB_CFLAGS) $(CROSS_CFLAGS) $(TARGET_CFLAGS)
AM_CCASFLAGS = $(INCLUDES)
noinst_LIBRARIES = lib.a
if MAY_SUPPLY_SYSCALLS
extra_objs = syscalls.o # add more object files here if you split up
else # syscalls.c into multiple files in the previous step
extra_objs =
endif
lib_a_SOURCES =
lib_a_LIBADD = $(extra_objs)
EXTRA_lib_a_SOURCES = syscalls.c crt0.c # add more source files here if you split up
lib_a_DEPENDENCIES = $(extra_objs) # syscalls.c into multiple files
lib_a_CCASFLAGS = $(AM_CCASFLAGS)
lib_a_CFLAGS = $(AM_CFLAGS)
if MAY_SUPPLY_SYSCALLS
all: crt0.o
endif
ACLOCAL_AMFLAGS = -I ../../..
CONFIG_STATUS_DEPENDENCIES = $(newlib_basedir)/configure.host
Note: After this, you need to run autoconf in the newlib/libc/sys/ directory, and autoreconf in the newlib/libc/sys/myos directory.
Signal handling
Newlib has two different mechanisms for dealing with UNIX signals (see the man pages for signal()/raise()). In the first, it provides its own emulation, where it maintains a table of signal handlers in a per-process manner. If you use this method, then you will only be able to respond to signals sent from within the current process. In order to support it, all you need to do is make sure your crt0 calls '_init_signal' before it calls main, which sets up the signal handler table.
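The idea behind the emulated mechanism can be sketched in a few lines: a table of handlers that lives inside the process, a function that registers a handler, and a function that looks the handler up and calls it. This is a simplified illustration, not newlib's actual source. All names (demo_signal, demo_raise, DEMO_NSIG, record_signal) are hypothetical, and the default action is reduced to returning -1.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_NSIG 32 /* hypothetical number of signals */

typedef void (*sig_handler_t)(int);

#define DEMO_SIG_DFL ((sig_handler_t)0)
#define DEMO_SIG_ERR ((sig_handler_t)-1)

/* One table per process: nothing outside this process can reach it. */
static sig_handler_t handler_table[DEMO_NSIG];

/* Register a handler and return the previous one. */
sig_handler_t demo_signal(int sig, sig_handler_t handler)
{
    if (sig < 0 || sig >= DEMO_NSIG) {
        errno = EINVAL;
        return DEMO_SIG_ERR;
    }
    sig_handler_t old = handler_table[sig];
    handler_table[sig] = handler;
    return old;
}

/* Deliver a signal to the current process by calling its handler. */
int demo_raise(int sig)
{
    if (sig < 0 || sig >= DEMO_NSIG) {
        errno = EINVAL;
        return -1;
    }
    sig_handler_t handler = handler_table[sig];
    if (handler == DEMO_SIG_DFL)
        return -1; /* a real implementation would apply the default action */
    handler(sig);
    return 0;
}

/* Example handler: remembers which signal arrived. */
int last_signal;

void record_signal(int sig)
{
    last_signal = sig;
}
```

Because the table is ordinary process memory, it also shows why this mode cannot see signals sent by other processes: no kernel involvement means no cross-process delivery.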
Alternatively, you can provide your own implementation. To do this, you need to define your own version of signal() in syscalls.c. A typical implementation would register the handler somewhere in kernel space, so that issuing a signal from another process causes the corresponding function to be called in the receiving process (this will also require some nifty stack-playing in the receiving process, as you are basically interrupting the program flow in the middle). You then need to provide a kill() function in syscalls.c which actually sends signals to another process. Newlib will still define a raise() function for you, but it is just a stub which calls kill() with the current process id. To switch newlib to this mode, you need to #define the SIGNAL_PROVIDED macro when compiling. A simple way to do this is to add the line:
newlib_cflags="${newlib_cflags} -DSIGNAL_PROVIDED"
to your host's entry in configure.host. It would probably also make sense to provide sigaction(), and provide signal() as a wrapper for it. Note that the Open Group's definition of sigaction states that 1) sigaction supersedes signal, and 2) an application shouldn't use both to manipulate the same signal.
Compiling
Newlib is very picky about the compiler, and you probably haven't built your own i686-myos-gcc toolchain yet, meaning that configure will not be happy when you set the target to i686-myos. Use this hack to get it to work (it worked fine for me).
Note: there must be a better way than this.
# newlib setup
CURRDIR=$(pwd)
# make symlinks (a bad hack) to make newlib work
cd ~/cross/bin/ # this is where the bootstrapped generic cross compiler toolchain (i686-elf-xxx) is installed in,
# change this based on your development environment.
ln i686-elf-ar i686-myos-ar
ln i686-elf-as i686-myos-as
ln i686-elf-gcc i686-myos-gcc
ln i686-elf-gcc i686-myos-cc
ln i686-elf-ranlib i686-myos-ranlib
# return
cd $CURRDIR
Then run the following commands to build newlib
mkdir build-newlib
cd build-newlib
../newlib-x.y.z/configure --prefix=/usr --target=i686-myos
make all
make DESTDIR=${SYSROOT} install
Note: SYSROOT is where all your OS-specific toolchains will be installed in. It will look like a miniature version of the Linux filesystem, but have your OS-specific toolchains in; I am using ~/myos as my SYSROOT directory.
Note: By default, newlib is configured not to support the %lld and %llu format specifiers in printf()/scanf() (i.e. it treats %lld the same as %ld). To override this, should it matter, add --enable-newlib-io-long-long to the configure invocation.
For some reason, the newer versions of newlib (at least for me) didn't put the libraries in a location where other utilities like binutils could find them. So here's another hack to fix this:
cp -ar $SYSROOT/usr/i686-myos/* $SYSROOT/usr/
After building all of this, your freshly built libc will be installed in your SYSROOT directory! Now you can progress to building your own OS Specific Toolchain.
Important Note: I found that for newlib to work properly, you have to link against libc, libg, libm, and libnosys. Hence, when porting GCC, make sure the LIB_SPEC definition in gcc/config/myos.h contains, at the bare minimum:
#define LIB_SPEC "-lc -lg -lm -lnosys"
I highly recommend rebuilding the library with your OS Specific Toolchain after you are done porting one. (don't forget to remove the symlinks, too.)
Conclusion
Well, you've done it. You've ported newlib to your OS! With this you can start creating user mode programs with ease! You may now also add in new functions to newlib, such as dlopen(), dlclose(), dlsym(), and dlerror() for dynamic linking support. Your operating system has a bright road ahead! You can now port the toolchain and run binutils and GCC on your own OS. Almost self-hosting, how do you feel?
Good luck!
Last Updated by 0fb1d8 for compatibility with newer versions of newlib and the OS Specific Toolchain tutorial.
Note: I used a lot of hacks in this article. If you find a better way to do something, please contribute to the page. Thank you.
Troubleshooting
Autotools
- Whenever you modify configure.ac/configure.in files, or Makefile.in files that are not auto-generated, you must run the appropriate autoconf/automake/autoreconf command.
- autoreconf is a tool that automatically calls the autotools as required to process the present working directory. It uses the versions of the autotools found in your path, so make sure to prepend(!) your custom build of the autotools to the PATH variable. autoreconf also honors environment variables such as AUTOCONF and AUTOMAKE, which can point it at specific binaries.
- *.ac and *.in files are processed by pattern substitution. Lone spaces or tabs WILL cause issues later, since Makefiles are whitespace-sensitive and the whitespace is never removed.
Library Implementation
- In some situations, the -DMISSING_SYSCALL_NAMES flag must be set in `newlib_cflags` so that functions which call the underscore variant of a syscall reach yours, i.e. sbrk() in your syscalls.c instead of _sbrk(). Otherwise, when linking, the symbol _sbrk will be reported as missing.
- The crt0.o object is overridden by the one generated by libgloss/aarch64. To use your crt file, make sure to overwrite the existing one in your sysroot with the version from the $NEWLIB_BUILD_DIR/$TARGET/newlib/ folder after installing newlib.
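The effect of the MISSING_SYSCALL_NAMES flag mentioned above can be pictured as a set of renaming macros: library code keeps calling the underscored name, and the preprocessor turns it into the plain name you defined. As far as I can tell, newlib does this through a header of #defines (_syslist.h in the versions I have looked at). The sketch below only imitates the idea with hypothetical names (demo_getpid, library_caller) so that it can be compiled on its own.

```c
/* Normally passed on the command line as -DMISSING_SYSCALL_NAMES. */
#define MISSING_SYSCALL_NAMES 1

#ifdef MISSING_SYSCALL_NAMES
/* Redirect the underscored name used inside the library
   to the plain name provided by syscalls.c. */
#define _demo_getpid demo_getpid
#endif

/* What you would write in syscalls.c (hypothetical name and value). */
int demo_getpid(void)
{
    return 1;
}

/* What library code does internally: it calls the underscored name. */
int library_caller(void)
{
    return _demo_getpid();
}
```

Without the macro, library_caller() would reference a symbol named _demo_getpid that nobody defines, which is the same kind of missing-symbol error described for _sbrk.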
Build System Tips
- When porting aarch64 (and perhaps other platforms), not having *-elf at the end of your target string can lead to a circular dependency in one of the Makefiles, causing the build to fail on copying the .spec files. Make sure to add a match for your target to the case "${target}" in statement in `libgloss/aarch64/configure.in`.
- The build system was rewritten in later versions of newlib. The instructions regarding the automake files will no longer apply. For the most part, it just means leaving out those parts, but be prepared to do your own detective work!
General Tips
- Copy-pasting may introduce aforementioned missing newlines or spurious spaces.
- Don't move your build artifacts. This may break dependencies.
- If you are stuck, try taking a look at other newlib ports, such as Jin Xue's lyos (https://github.com/Jimx-/lyos).