Tail Recursion and Tail Call Optimization
What is a tail call?[edit | edit source]
Informally, a tail call is when a function calls another function, and returns immediately. In assembly, it's when a CALL and a RET instructions are found right next to each other. For example, in the following code:
myFunction:
CALL foo
CALL bar
RET
CALL bar
is a tail call, but CALL foo
is not.
Note that, it's important that there is no operation between CALL bar
and RET
. Consider the following example:
int foo(x){
return bar(x) * 2;
}
Even though this looks like a tail call, the caller has to modify the return value after the CALL
operation. As such, it's not actually a tail call. In assembly:
CALL bar
SHL EAX, 1
RET
However the following is a tail call:
int foo(x){
return bar(x * 2);
}
In assembly:
SHL EAX, 1
CALL bar
RET
As you can see, there is no operation between CALL
and RET
this time.
Optimizing a tail call[edit | edit source]
A tail call can be optimized by simply replacing CALL
with JMP
and removing RET
. So looking at this example, one more time:
int foo(x){
return bar(x * 2);
}
In assembly, no tail call optimization:
SHL EAX, 1
CALL bar
RET
In assembly, with tail call optimization:
SHL EAX, 1
JMP bar
This has a minor speed up as you no longer have to save and restore the instruction pointer. The callee (in this case, bar
) will return all the way back to the caller of this function.
It doesn't have much of an impact on a function like this, but it can have immense impact on recursive calls. A recursive call that takes 1000 recursions will have to save and restore the instruction pointer 1001 times without tail call optimization, but only once with it.
Epilogues and Prologues[edit | edit source]
In the above examples, it is assumed that functions don't have a prologue or epilogue which is highly impractical. As such a proper implementation of tail call optimization is a lot trickier than shown here. This is only a rudimentary example.