Tail Recursion and Tail Call Optimization

From OSDev.wiki
Jump to navigation Jump to search

What is a tail call?

Informally, a tail call is when a function calls another function, and returns immediately. In assembly, it's when a CALL and a RET instructions are found right next to each other. For example, in the following code:

 myFunction:
   CALL foo
   CALL bar
   RET

CALL bar is a tail call, but CALL foo is not.

Note that, it's important that there is no operation between CALL bar and RET. Consider the following example:

 int foo(x){
   return bar(x) * 2;
 }

Even though this looks like a tail call, the caller has to modify the return value after the CALL operation. As such, it's not actually a tail call. In assembly:

 CALL bar
 SHL EAX, 1
 RET

However the following is a tail call:

 int foo(x){
   return bar(x * 2);
 }

In assembly:

 SHL EAX, 1
 CALL bar
 RET

As you can see, there is no operation between CALL and RET this time.

Optimizing a tail call

A tail call can be optimized by simply replacing CALL with JMP and removing RET. So looking at this example, one more time:

 int foo(x){
   return bar(x * 2);
 }

In assembly, no tail call optimization:

 SHL EAX, 1
 CALL bar
 RET

In assembly, with tail call optimization:

 SHL EAX, 1
 JMP bar

This has a minor speed up as you no longer have to save and restore the instruction pointer. The callee (in this case, bar) will return all the way back to the caller of this function.

It doesn't have much of an impact on a function like this, but it can have immense impact on recursive calls. A recursive call that takes 1000 recursions will have to save and restore the instruction pointer 1001 times without tail call optimization, but only once with it.

Epilogues and Prologues

In the above examples, it is assumed that functions don't have a prologue or epilogue which is highly impractical. As such a proper implementation of tail call optimization is a lot trickier than shown here. This is only a rudimentary example.