Calling Conventions: Difference between revisions

==Basics==
As a general rule, a function which follows the C calling conventions, and is appropriately declared (see below) in the C headers, can be called as a normal C function. Most of the burden for following the calling rules falls upon the assembly program.
 
== Cheat Sheets ==
 
Here is a quick overview of common calling conventions. Note that the calling conventions are usually more complex than represented here (for instance, how is a large struct returned? How about a struct that fits in two registers? How about va_list's?). Look up the specifications if you want to be certain. It may be useful to write a test function and use gcc -S to see how the compiler generates code, which may give a hint of how the calling convention specification should be interpreted.
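
As a minimal sketch of such a test function (the file name <tt>cc_test.c</tt> and the function name <tt>sum7</tt> are made up for illustration), the following can be compiled with <tt>gcc -S cc_test.c</tt>. It takes seven integer arguments, which is more than any convention in the table below passes in registers, so the generated assembly should show both register and stack arguments where the platform uses both.

<syntaxhighlight lang="c">
/* cc_test.c: hypothetical probe for inspecting argument passing.
 * Generate assembly with: gcc -S cc_test.c   (output in cc_test.s) */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    /* Every parameter is used, so the compiler has to fetch each
     * one from wherever the calling convention placed it. */
    return a + b + c + d + e + f + g;
}
</syntaxhighlight>

Reading the function body in the resulting <tt>.s</tt> file shows which registers and stack offsets the compiler reads the parameters from.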
 
{| {{wikitable}}
! Platform
! Return Value
! Parameter Registers
! Additional Parameters
! Stack Alignment
! Scratch Registers
! Preserved Registers
! Call List
|-
| System V i386 || eax, edx || none || stack (right to left)<sup>[[#Note1|1]]</sup> || || eax, ecx, edx || ebx, esi, edi, ebp, esp || ebp
|-
| System V X86_64<sup>[[#Note2|2]]</sup> || rax, rdx || rdi, rsi, rdx, rcx, r8, r9 || stack (right to left)<sup>[[#Note1|1]]</sup> || 16-byte at call<sup>[[#Note3|3]]</sup> || rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11 || rbx, rsp, rbp, r12, r13, r14, r15 || rbp
|-
| Microsoft x64 || rax || rcx, rdx, r8, r9 || stack (right to left)<sup>[[#Note1|1]]</sup> || 16-byte at call<sup>[[#Note3|3]]</sup> || rax, rcx, rdx, r8, r9, r10, r11 || rbx, rdi, rsi, rsp, rbp, r12, r13, r14, r15 || rbp
|-
| ARM || r0, r1 || r0, r1, r2, r3 || stack || 8 byte<sup>[[#Note4|4]]</sup> || r0, r1, r2, r3, r12 || r4, r5, r6, r7, r8, r9, r10, r11, r13, r14 ||
|}
 
<small id="Note1">Note 1: The called function is allowed to modify the arguments on the stack, so the caller must not assume the stack parameters are preserved. The caller should clean up the stack.</small>
 
<small id="Note2">Note 2: There is a 128 byte area below the stack pointer called the 'red zone', which may be used by leaf functions without adjusting %rsp. This requires the kernel to skip an additional 128 bytes below %rsp when delivering signals in user-space. This is <em>not</em> done by the CPU - if interrupts use the current stack (as with kernel code), and the red zone is enabled (default), then interrupts will silently corrupt the stack. Always pass -mno-red-zone when compiling kernel code (including support libraries such as a libc embedded in the kernel) if interrupts don't respect the red zone.</small>
 
<small id="Note3">Note 3: Stack is 16 byte aligned at time of call. The call pushes %rip, so the stack is 16-byte aligned again if the callee pushes %rbp.</small>
 
<small id="Note4">Note 4: Stack is 8 byte aligned at all times outside of prologue/epilogue of function.</small>
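
The alignment arithmetic in Note 3 can be sketched in C. The helper names below are hypothetical, and the code only does arithmetic on a pretend stack pointer value; it never touches the real stack.

<syntaxhighlight lang="c">
#include <stdint.h>

/* Hypothetical helper: value of %rsp after one 8-byte push
 * (either the return address pushed by CALL, or "push %rbp"). */
static uint64_t after_push64(uint64_t rsp)
{
    return rsp - 8;
}

/* Hypothetical helper: nonzero if the value is 16-byte aligned. */
static int is_16_aligned(uint64_t rsp)
{
    return (rsp & 15u) == 0;
}
</syntaxhighlight>

Starting from a 16-byte aligned value, one call to <tt>after_push64</tt> (the return address) leaves the value misaligned by 8, and a second call (the saved %rbp) restores 16-byte alignment.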
 
== System V ABI ==
{{Main|System V ABI}}
 
The System V ABI is one of the major ABIs in use today and is virtually universal among Unix systems. It is the calling convention used by toolchains such as <tt>i686-elf-gcc</tt> and <tt>x86_64-elf-gcc</tt>.
 
==External References==
In order to call a foreign function from C, it must have a correct C prototype. Thus, if the function <tt>fee()</tt> takes the arguments fie, foe, and fum, in C calling order, and returns an integer value, then the corresponding header file should have the following prototype:
 
<syntaxhighlight lang="c">
int fee(int fie, char foe, double fum);
</syntaxhighlight>
 
Similarly, any global variables in the assembly code must be declared <tt>extern</tt>:
 
<syntaxhighlight lang="c">
extern int frotz;
</syntaxhighlight>
 
C functions called from assembly or other languages must be declared as appropriate for that language. For example, in NASM, the C function
 
<syntaxhighlight lang="c">
int foo(int bar, char baz, double quux);
</syntaxhighlight>
 
would be declared
 
<syntaxhighlight lang="asm">
extern foo
</syntaxhighlight>
 
Also, in most assembly languages, a function or variable that is to be exported must be declared global:
 
<syntaxhighlight lang="asm">
global foo
global frotz
</syntaxhighlight>
 
==Name Mangling==
 
In some object formats ([[a.out]]), the name of a C function is automagically mangled by prepending it with an underscore ('_'). Thus, to call a C function <tt>foo()</tt> in assembly with such a format, you must define it as <tt>extern _foo</tt> instead of <tt>extern foo</tt>. This requirement does not apply to most modern formats such as [[COFF]], [[PE]], and [[ELF]].
 
C++ name mangling is much more severe, as the C++ compiler encodes the type information from the parameter list into the symbol. (This is what enables function overloading in C++ in the first place.) The Binutils package contains the tool <tt>c++filt</tt>, which demangles symbol names and can therefore be used to check which mangled name belongs to a given function.
 
==Registers==
 
==Passing Function Arguments==
GCC/x86 passes function arguments on the stack. These arguments are pushed in reverse order from their order in the argument list. Furthermore, since the x86 protected-mode stack operations operate on 32-bit values, each value is always pushed as a 32-bit value, even if the actual value is smaller than 32 bits. Thus, for function <tt>foo()</tt>, the value of <tt>quux</tt> (a 64-bit FP value) is pushed first as two 32-bit values, high half first so that the low half ends up at the lower address; the value of <tt>baz</tt> is pushed as the low byte of a 32-bit value; and then finally <tt>bar</tt> is pushed as a 32-bit value.
 
To pass arguments to a C function, the calling function must push the argument values as described above. Thus, to call foo() from a [[NASM]] assembly program, you would do something like this:
 
<syntaxhighlight lang="asm">
push edx ; high 32 bits of quux
push eax ; low 32 bits of quux
push ebx ; baz (held in bl, the low byte of ebx)
push ecx ; bar
call foo
add esp, 16 ; caller removes the four 32-bit argument slots
</syntaxhighlight>
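
To see how a <tt>double</tt> breaks down into the two 32-bit values that occupy separate stack slots, the following C sketch may help. The helper name <tt>split_double</tt> is hypothetical, and the code assumes that <tt>double</tt> is a 64-bit IEEE 754 value, as it is with GCC on x86.

<syntaxhighlight lang="c">
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split a double into its low and high
 * 32-bit halves. Assumes a 64-bit IEEE 754 double. */
static void split_double(double v, uint32_t *lo, uint32_t *hi)
{
    uint64_t bits;

    /* memcpy is the well-defined way to reinterpret the bytes. */
    memcpy(&bits, &v, sizeof bits);

    *lo = (uint32_t)(bits & 0xFFFFFFFFu); /* lower address on the stack */
    *hi = (uint32_t)(bits >> 32);         /* higher address on the stack */
}
</syntaxhighlight>

For the value <tt>1.0</tt>, this yields a low half of <tt>0x00000000</tt> and a high half of <tt>0x3FF00000</tt>.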
 
==Accessing Function Arguments==
In the GCC/x86 C calling convention, the first thing any function that accepts formal arguments should do is push the value of <tt>EBP</tt> (the frame base pointer of the calling function), then copy the value of <tt>ESP</tt> to <tt>EBP</tt>. This sets the function's own frame pointer, which is used to track both the arguments and (in C, or in any properly reentrant assembly code) the local variables.
 
To access arguments passed by a C function, you need to use <tt>EBP</tt> plus an offset equal to 4 * (n + 2), where n is the zero-indexed position of the parameter's 32-bit slot in the argument list (not its position in the order it was pushed). The + 2 accounts for the calling function's saved frame pointer and the return address (the latter is pushed automatically by <tt>CALL</tt>, and popped by <tt>RET</tt>).
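
The offset rule can be written as a one-line C function. The name <tt>arg_offset</tt> is hypothetical and exists only to illustrate the arithmetic.

<syntaxhighlight lang="c">
/* Hypothetical helper: byte offset from EBP of the n-th 32-bit
 * argument slot (zero-indexed), following the 4 * (n + 2) rule. */
static unsigned arg_offset(unsigned n)
{
    return 4u * (n + 2u);
}
</syntaxhighlight>

For <tt>fee(int fie, char foe, double fum)</tt>, <tt>fie</tt> is slot 0 (offset 8), <tt>foe</tt> is slot 1 (offset 12), and <tt>fum</tt> occupies slots 2 and 3 (offsets 16 and 20).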
 
Thus, in function <tt>fee</tt>, to move <tt>fie</tt> into <tt>ECX</tt>, <tt>foe</tt> into <tt>BL</tt>, and <tt>fum</tt> into <tt>EAX</tt> (low half) and <tt>EDX</tt> (high half), you would write (in NASM):
 
<syntaxhighlight lang="asm">
mov ecx, [ebp + 8] ; fie
mov bl, [ebp + 12] ; foe
mov eax, [ebp + 16] ; low 32 bits of fum
mov edx, [ebp + 20] ; high 32 bits of fum
</syntaxhighlight>
 
As stated earlier, return values in GCC are passed using <tt>EAX</tt> and <tt>EDX</tt>. If a value exceeds 64 bits, it must be returned through a pointer.
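
From the C side, the two cases look like the sketch below. The names <tt>make_u64</tt>, <tt>struct big</tt> and <tt>fill_big</tt> are hypothetical; the comments describe what is expected on 32-bit x86, where a 64-bit integer result comes back with its low half in <tt>EAX</tt> and its high half in <tt>EDX</tt>.

<syntaxhighlight lang="c">
#include <stdint.h>

/* Hypothetical example: a 64-bit result fits in the EDX:EAX pair
 * on 32-bit x86 (low half in EAX, high half in EDX). */
uint64_t make_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical 128-bit type: too large for two 32-bit registers. */
struct big {
    uint32_t w[4];
};

/* The result is written through a pointer supplied by the caller. */
void fill_big(struct big *out, uint32_t seed)
{
    for (unsigned i = 0; i < 4; i++)
        out->w[i] = seed + i;
}
</syntaxhighlight>

Compiling this with <tt>gcc -S</tt> for a 32-bit target is one way to check how a particular compiler version handles each case.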
*[http://www.delorie.com/djgpp/doc/ug/asm/calling.html DJGPP FAQ: GCC calling conventions]
*[http://gul.ime.usp.br/Docs/docs/howto/other-formats/html/HOWTO-INDEX-html/Assembly-HOWTO-5.html Linux Assembly Language HOWTO chapter 5]
*http://files.osdev.org/mirrors/geezer/osd/libc/index.htm
 
[[Category:ABI]]
[[Category:C]]
[[de:Aufrufkonventionen]]