Languages

There are many programming languages, some better suited to OS development and kernel writing than others. This page discusses what makes a language suitable for the job.

History

Early operating systems were written entirely in the Assembly language dialect of their respective CPU. In modern OSes there are still parts that can only be done in Assembly. Many high-level languages have been used for OS development in the past, including C, Lisp, FORTH, C#, C++, Modula-2, Ada, Bliss, and PL/I. For most languages other than C, a fair amount of Assembly and C development is required to provide the runtime environment that supports the language's abstractions.

Warning

Not all languages are suitable for low-level system programming, or have suitable low-level development tools available for them. Even those which are suitable often require specific runtime support, where C does not. Also, the vast majority of OS-related resources (tutorials and how-to examples, including this Wiki except where noted otherwise) assume C as the primary development language, so an OS developer should at least be able to read C code.

Using a language other than C entails a good deal of extra effort. But there are some developers who are willing to put in that effort in order to suit the development of their OSes to their ways of thinking (for example: programming paradigms).

On the other hand, trying to write an OS in a language that is normally run by an interpreter or virtual machine, like Perl or Java, is unlikely to succeed. There are research projects about it, of course, but nothing that has completely changed the way we write kernels so far. And trying to write it in HTML, PHP or JavaScript would just be proof that you have much to learn...

Can I use language XYZ?

If you'd like to know if your favourite language is suited for OS development, just consider the following questions:

- Can you cope with data structures that have a specific arrangement of bits and bytes (mandatory for e.g. MMU structures and similar things)?
- Can you take control of memory allocation and freeing? Or can you at least subdivide a large chunk of memory into smaller chunks that other functions can use transparently (necessary for any sort of memory management)?
- Are you able to build a self-sufficient run-time library to support the language features you'll need?
- Can you easily interface XYZ with some assembly code (yes, you'll have some, at least in the run-time library you'll have to write)?
- If XYZ fits the other points and is an interpreted language, can you invoke code coming from raw data bytes with XYZ, i.e. jump to a specific address and continue execution there (this will be mandatory for loading and running programs)?

If the answer to any of those questions turns out to be "whoops, no, I cannot do that with language XYZ", then chances are that XYZ will be of no help for OS development - or will have to be mixed with other languages and stubs to function.
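As a rough illustration of the first two questions, here is a minimal C sketch. The entry layout is only in the style of 32-bit x86 paging (bit 0 present, bit 1 writable, bits 12..31 frame address), and the names `pte_make`, `struct arena`, `arena_init` and `arena_alloc` are hypothetical, invented for this example. It shows building a value with an exact bit arrangement, and subdividing one large chunk of memory into smaller chunks without relying on any existing allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits for a 32-bit page table entry. */
#define PTE_PRESENT  0x001u
#define PTE_WRITABLE 0x002u

/* Combine a page-aligned frame address with the low 12 flag bits. */
uint32_t pte_make(uint32_t frame_addr, uint32_t flags)
{
    return (frame_addr & 0xFFFFF000u) | (flags & 0x00000FFFu);
}

/* A "bump" arena: hands out pieces of one caller-supplied chunk.
 * Nothing is ever freed individually; that is a deliberate simplification. */
struct arena {
    uint8_t *base;  /* start of the chunk */
    size_t   size;  /* total bytes in the chunk */
    size_t   used;  /* bytes handed out so far */
};

void arena_init(struct arena *a, void *mem, size_t size)
{
    a->base = mem;
    a->size = size;
    a->used = 0;
}

/* Returns n bytes whose offset from base is a multiple of align,
 * or NULL if the chunk is exhausted. align must be a power of two;
 * this sketch assumes base itself is suitably aligned. */
void *arena_alloc(struct arena *a, size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}
```

If a language cannot express something equivalent to these two pieces (fixed-width integers with bitwise operations, and raw access to a region of memory it did not allocate itself), it will need help from another language for at least that part of the kernel.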

Can't I write a compiler for XYZ?

The only thing more complex than writing a compiler is writing an operating system. As you are already planning to do the latter, deciding to do the former as well is like finding new ways to forge metals in order to build a better car. The canonical starting point is the "dragon book" ("Compilers - Principles, Techniques, and Tools", listed on the Books page).

But I heard of an OS written in language XYZ, isn't it interpreted?

This is a red herring. There is no such thing as an "interpreted language". A language can be implemented using some kind of interpreter or compiler; you would have to check the specific details of that project to know whether its approach is suitable.

You may from time to time hear of operating systems written in languages which are usually interpreted, or which used an interpreter of some sort: JavaOS, Genera (the Symbolics Lisp Machine OS), Smalltalk-80, UCSD Pascal, the various FORTH systems, etc. Most of these fall into one of three categories:

The operating system runs in a low-level interpreter, written in Assembly or some systems language like C, which is what actually interacts with the hardware. In effect, the 'operating system' is just an application running on top of another, lower-level OS. Smalltalk-80, UCSD Pascal, and JavaOS work like this, though they also have some modules which are compiled to native code (see below).

All or part of the code has been compiled to native code. This may involve using a subset of the language with reduced runtime requirements (e.g., Pre-Scheme or Slang; while they have not been used for OS development to date, they demonstrate the sort of low-level implementation language that can be used this way).
- FORTH-based operating systems are a special case of this. While FORTH is usually described as an interpreted language, the threaded-code interpreters many FORTH systems use work differently from most other interpreters; in effect, the interpreter walks through the various FORTH 'words' that the code is composed of until it reaches the low-level words that are implemented in assembly or compiled code, which is what actually gets executed. Furthermore, FORTH systems incorporate a special sort of assembler, which produces code specifically meant to be used by the interpreter; also, commonly used 'words' can be compiled into native code as needed. Finally, many embedded FORTH systems use special-purpose hardware (see below) to support the language.
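The "walking through words" idea can be sketched in C. This is a toy under stated assumptions, not how any particular FORTH is built: the names `struct word`, `execute`, `SQUARE` and `FOURTH` are made up for the example, the C functions stand in for primitives that a real system would write in assembly, and the C call stack plays the role that a FORTH return stack would normally have.

```c
#include <assert.h>
#include <stddef.h>

/* A word is either a primitive (native code) or a list of other words. */
struct word {
    void (*prim)(void);              /* non-NULL: run this native code */
    const struct word *const *body;  /* otherwise: NULL-terminated word list */
};

/* The data stack shared by all words. */
static long stack[32];
static int sp;

void push(long v) { assert(sp < 32); stack[sp++] = v; }
long pop(void)    { assert(sp > 0);  return stack[--sp]; }

/* Primitives: the only places where real work happens. */
void prim_dup(void) { long v = pop(); push(v); push(v); }
void prim_mul(void) { long b = pop(); long a = pop(); push(a * b); }

const struct word DUP = { prim_dup, NULL };
const struct word MUL = { prim_mul, NULL };

/* Roughly ": SQUARE DUP * ;" and ": FOURTH SQUARE SQUARE ;" */
const struct word *const square_body[] = { &DUP, &MUL, NULL };
const struct word SQUARE = { NULL, square_body };
const struct word *const fourth_body[] = { &SQUARE, &SQUARE, NULL };
const struct word FOURTH = { NULL, fourth_body };

/* The inner interpreter: descend through word lists until a primitive
 * is reached, run it, then continue with the next word in the list. */
void execute(const struct word *w)
{
    if (w->prim != NULL) {
        w->prim();
        return;
    }
    for (const struct word *const *p = w->body; *p != NULL; ++p)
        execute(*p);
}
```

Pushing 3 and executing `FOURTH` leaves 81 on the stack. Note that the interpreter itself does almost nothing; nearly all the time is spent in the primitives, which is why replacing them with hand-written assembly gets such a system close to the hardware.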

The system ran on specialized hardware and microcode, which acted as a hardwired 'interpreter' for its primary language, or for the portable bytecode which it normally used. This type of system includes SOAR (Smalltalk On A RISC), the Rekursiv system, the Lilith Modula-2 system, and the Burroughs 6500 (a mainframe designed for running Algol-60 in the 1960s). The system programming techniques for these cannot work on stock hardware. For example:
- The MIT CADR Lisp machine architecture had an extensive instruction set with hardware support for certain high-level operations such as type-tag checking and GC. It had a tagged architecture, meaning that a portion of the 36-bit addressing word was designated for type information. Typically these machines had a variety of compilers, including one for the system language, Lisp, which was capable of taking advantage of the additional instruction set.
- The Rekursiv Single-Board Computer had hardware support for a writable instruction set (that is, you could dynamically add microcode instructions) and associative memory dispatch tables for supporting object-oriented programming.
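To see what such hardware saves, here is a sketch of type-tag checking done in software on stock hardware. The layout is an assumption made for the example (a 32-bit word with the tag in the low 3 bits) and is not the layout of the CADR or any other real machine; `make_word`, `word_tag`, `word_payload` and `fixnum_add` are hypothetical names. Arithmetic overflow is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 3 bits are the type tag, the rest is the payload. */
#define TAG_BITS 3u
#define TAG_MASK ((1u << TAG_BITS) - 1u)

enum tag { TAG_FIXNUM = 0, TAG_CONS = 1, TAG_SYMBOL = 2 };

typedef uint32_t word_t;

word_t make_word(enum tag t, uint32_t payload)
{
    return (payload << TAG_BITS) | (uint32_t)t;
}

enum tag word_tag(word_t w)     { return (enum tag)(w & TAG_MASK); }
uint32_t word_payload(word_t w) { return w >> TAG_BITS; }

/* Checked addition: every operand's tag is tested with explicit
 * mask-and-compare instructions before the real work is done.
 * Returns 0 on success, -1 if either operand is not a fixnum. */
int fixnum_add(word_t a, word_t b, word_t *out)
{
    if (word_tag(a) != TAG_FIXNUM || word_tag(b) != TAG_FIXNUM)
        return -1;
    *out = make_word(TAG_FIXNUM, word_payload(a) + word_payload(b));
    return 0;
}
```

On a tagged architecture the check inside `fixnum_add` is part of what the instruction itself does, so code written for such a machine simply assumes it; ported to stock hardware, the same code has to carry these checks explicitly.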