Inline Assembly: Difference between revisions

From OSDev.wiki
m (Bot: Replace deprecated source tag with syntaxhighlight)
 
(18 intermediate revisions by 12 users not shown)
Line 1: Line 1:
The idea behind '''Inline Assembly''' is is to embed assembler instructions in your C/C++ code, using the asm keyword, when there's no option but to use assembly language.
The idea behind '''Inline Assembly''' is to embed assembler instructions in your C/C++ code, using the <tt>asm</tt> keyword, when there's no option but to use [[Assembly]] language.


==Overview==
== Overview ==
Sometimes, even though C/C++ is your language of choice, you '''need''' to use some asm code in your operating system. Be it because of extreme optimization needs or because the code you're implementing is highly hardware-specific (like, say, outputting data through a port), the result is the same : there's no way around it. You must use assembly.
Sometimes, even though C/C++ is your language of choice, you '''need''' to use some assembler code in your operating system. Be it because of extreme optimization needs or because the code you're implementing is highly hardware-specific (like, say, outputting data through a port), the result is the same: there's no way around it. You must use assembly.


One of the options you have is writing an asm function and calling it, however there can be times when even the "call" overhead is too much for you. In that case, what you need is inline assembly, which means inserting arbitrary assembly snippets in the middle of your code, using the asm() "function". The way this function works is compiler-specific, and this article describes the way it works in GCC since it is by far the most used compiler in the OS world.
One of the options you have is writing an assembly function and calling it, however there can be times when even the "call" overhead is too much for you. In that case, what you need is inline assembly, which means inserting arbitrary assembly snippets in the middle of your code, using the <tt>asm()</tt> keyword. The way this keyword works is compiler-specific. This article describes the way it works in GCC since it is by far the most used compiler in the OS world.


==Syntax==
== Syntax ==
This is the syntax for calling asm() in your C/C++ code:
This is the syntax for using the <tt>asm()</tt> keyword in your C/C++ code:


<source lang="c">
<syntaxhighlight lang="c">
asm ( assembler template
asm ( assembler template
: output operands (optional)
: output operands (optional)
Line 15: Line 15:
: clobbered registers list (optional)
: clobbered registers list (optional)
);
);
</syntaxhighlight>
</source>


Assembler template is basically GAS-compatible code, except that register names now start with %% instead of %. This means that the following code...
Assembler template is basically [[GAS]]-compatible code, except when you have constraints, in which case register names must start with %% instead of %. This means that the following two lines of code will both move the contents of the <tt>eax</tt> register into <tt>ebx</tt>:


<source lang="c">
<syntaxhighlight lang="c">
asm ("movl %%eax, %%ebx");
asm ("movl %eax, %ebx");
asm ("movl %%eax, %%ebx" : );
</source>
</syntaxhighlight>


...will move eax's content into ebx. Now, you may wonder why this %% comes in. This is where an interesting feature of inline assembly comes in : you can make use of some of your C variables in your assembly code. And since, in order to make implementation of this mechanism simpler, GCC names these variables %0, %1, and so on in your assembly code, starting from the first variable mentioned in the input/output operand sections, you're required to use this %% syntax in order to help GCC making a separation between registers and parameters...
Now, you may wonder why this %% comes in. This is where an interesting feature of inline assembly comes in: you can make use of some of your C variables in your assembly code. And since, in order to make implementation of this mechanism simpler, GCC names these variables %0, %1, and so on in your assembly code, starting from the first variable mentioned in the input/output operand sections. You're required to use this %% syntax in order to help GCC differentiate between registers and parameters.


How exactly operands work will be explained in more details in later sections. For now, sufficient is to say that if you write something like that...
How exactly operands work will be explained in more details in later sections. For now, it is sufficient to say that if you write something like that:


<source lang="c">
<syntaxhighlight lang="c">
int a=10, b;
int a=10, b;
asm ("movl %1, %%eax;
asm ("movl %1, %%eax;
Line 35: Line 36:
:"%eax" /* clobbered register */
:"%eax" /* clobbered register */
);
);
</syntaxhighlight>
</source>


You've managed to copy the value of "a" in "b" using assembly code, effectively using some C variables in your assembly code. Congratulations !
then you've managed to copy the value of "a" into "b" using assembly code, effectively using some C variables in your assembly code. Congratulations!


The last "clobbered register" section is used in order to tell GCC that your code is using some of the processor's registers, and that it should move any active data from the running program out of this register before executing the asm snippet. In the example above, we move b to eax in the first instruction, effectively erasing its content, so we need to ask GCC to clear this register from unsaved data before operation.
The last "clobbered register" section is used in order to tell GCC that your code is using some of the processor's registers, and that it should move any active data from the running program out of this register before executing the asm snippet. In the example above, we move <tt>a</tt> to eax in the first instruction, effectively erasing its content, so we need to ask GCC to clear this register from unsaved data before operation.


===Assembler Template===
=== Assembler Template ===
The Assembler Template defines the assembler instructions to inline. The default is to use AT&T syntax here. If you want to use Intel syntax, <tt>-masm=intel</tt> should be specified as a command-line option.
The Assembler Template defines the assembler instructions to inline. The default is to use AT&T syntax here. If you want to use Intel syntax, <tt>-masm=intel</tt> should be specified as a command-line option.


As an example, to halt the CPU, you just have to use the following command:
As an example, to halt the CPU, you just have to use the following command:


<source lang="c">
<syntaxhighlight lang="c">
asm( "hlt" );
asm( "hlt" );
</syntaxhighlight>
</source>


===Output Operands===
=== Output Operands ===
The Output Operands section is used in order to tell the compiler / assembler how it should handle C variables used to store some output from the ASM code. The Output Operands are a list of pairs, each operand consisting of a string literal, known as "constraint", stating where the C variable should be mapped (registers are generally used for optimal performance), and a C variable to map to (in braces).
The Output Operands section is used in order to tell the compiler / assembler how it should handle C variables used to store some output from the ASM code. The Output Operands are a list of pairs, each operand consisting of a string literal, known as "constraint", stating where the C variable should be mapped (registers are generally used for optimal performance), and a C variable to map to (in parentheses).


In the constraint, 'a' refers to EAX, 'b' to EBX, 'c' to ECX, 'd' to EDX, 'S' to ESI, and 'D' to EDI (read the GCC manual for a full list), assuming that you are coding for the IA32 architecture. An equation sign indicates that your assembly code does not care about the initial value of the mapped variable (which allows some optimization). With all that in mind, it's now pretty clear that the following code sets EAX = 0.
In the constraint, 'a' refers to EAX, 'b' to EBX, 'c' to ECX, 'd' to EDX, 'S' to ESI, and 'D' to EDI (read the GCC manual for a full list), assuming that you are coding for the IA32 architecture. An equation sign indicates that your assembly code does not care about the initial value of the mapped variable (which allows some optimization). With all that in mind, it's now pretty clear that the following code sets EAX = 0.


<source lang="c">
<syntaxhighlight lang="c">
int EAX;
int EAX;
asm( "movl $0, %0"
asm( "movl $0, %0"
: "=a" (EAX)
: "=a" (EAX)
);
);
</syntaxhighlight>
</source>


Notice that the compiler enumerates the operand starting with %0, and that you don't have to add a register to the clobbered register list if it's used to store an output operand. GCC is smart enough to figure out what to do all by itself.
Notice that the compiler enumerates the operand starting with %0, and that you don't have to add a register to the clobbered register list if it's used to store an output operand. GCC is smart enough to figure out what to do all by itself.
Line 66: Line 67:
Starting with GCC 3.1, you can use more readable labels instead of the error-prone enumeration:
Starting with GCC 3.1, you can use more readable labels instead of the error-prone enumeration:


<source lang="c">
<syntaxhighlight lang="c">
int current_task;
int current_task;
asm( "str %[output]"
asm( "str %[output]"
: [output] "=r" (current_task)
: [output] "=r" (current_task)
);
);
</syntaxhighlight>
</source>


These labels are in a namespace of their own, and will not collide with any C identifiers. The same can be done for input operands, too.
These labels are in a namespace of their own, and will not collide with any C identifiers. The same can be done for input operands, too.


===Input Operands===
=== Input Operands ===
While the Output Operands are generally used for... well... output, the Input Operands allows to parametrize the ASM code; i.e., passing read-only parameters from C code to ASM block. Again, string literals are used to specify the details.
While the Output Operands are generally used for... well... output, the Input Operands allows to parametrize the ASM code; i.e., passing read-only parameters from C code to ASM block. Again, string literals are used to specify the details.


If you want to move some value to EAX, you can do it the following way (even though it would certainly be pretty useless to do so instead of directly mapping the value to EAX):
If you want to move some value to EAX, you can do it the following way (even though it would certainly be pretty useless to do so instead of directly mapping the value to EAX):


<source lang="c">
<syntaxhighlight lang="c">
int randomness = 4;
int randomness = 4;
asm( "movl %0, %%eax"
asm( "movl %0, %%eax"
:
:
: "b" (randomness)
: "b" (randomness)
: %%eax
: "eax"
);
);
</syntaxhighlight>
</source>


Note that GCC will always assume that input operands are read-only (unchanged). The correct thing to do when input operands are written to is to list them as outputs, but without using the equation sign because this time their original value matters. Here is a simple example:
Note that GCC will always assume that input operands are read-only (unchanged). The correct thing to do when input operands are written to is to list them as outputs, but without using the equation sign because this time their original value matters. Here is a simple example:
<source lang="c">
<syntaxhighlight lang="c">
asm("mov %%eax,%%ebx": : "a" (amount));//useless but it gets the idea
asm("mov %%eax,%%ebx": : "a" (amount));//useless but it gets the idea
</syntaxhighlight>
</source>
Eax will contain "amount" and be moved into ebx.
Eax will contain "amount" and be moved into ebx.


===Clobbered Registers List===
=== Clobbered Registers List ===
It is important to remember one thing: ''The C/C++ compiler knows nothing about Assembler''. For the compiler, the asm statement is opaque, and if you did not specify any output, it might even come to the conclusion that it's a no-op and optimize it away. Some third-party docs indicate that using asm volatile will cause the keyword to not be moved. However, according to the GCC documentation, ''The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable.'', which only indicates that it will not be deleted (i.e. whether it may still be moved is an unanswered question). An approach that should work is to use asm (volatile) and put '''memory''' in the clobber registers, like so:
It is important to remember one thing: ''The C/C++ compiler knows nothing about Assembler''. For the compiler, the asm statement is opaque, and if you did not specify any output, it might even come to the conclusion that it's a no-op and optimize it away. Some third-party docs indicate that using asm volatile will cause the keyword to not be moved. However, according to the GCC documentation, ''The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable.'', which only indicates that it will not be deleted (i.e. whether it may still be moved is an unanswered question). An approach that should work is to use asm (volatile) and put '''memory''' in the clobber registers, like so:


<source lang="c">
<syntaxhighlight lang="c">
__asm__("cli": : :"memory"); // Will cause the statement not to be moved, but it may be optimized away.
__asm__("cli": : :"memory"); // Will cause the statement not to be moved, but it may be optimized away.
__asm__ __volatile__("cli": : :"memory"); // Will cause the statement not to be moved nor optimized away.
__asm__ __volatile__("cli": : :"memory"); // Will cause the statement not to be moved nor optimized away.
</syntaxhighlight>
</source>


Since the compiler uses CPU registers for internal optimization of your C/C++ variables, and doesn't know about ASM opcodes, you have to warn it about any registers that might get clobbered as a side effect, so the compiler can save their contents before making your ASM call.
Since the compiler uses CPU registers for internal optimization of your C/C++ variables, and doesn't know about ASM opcodes, you have to warn it about any registers that might get clobbered as a side effect, so the compiler can save their contents before making your ASM call.
Line 107: Line 108:
The Clobbered Registers List is a comma-separated list of register names, as string literals.
The Clobbered Registers List is a comma-separated list of register names, as string literals.


===Wildcards: How you can let the compiler choose===
=== Wildcards: How you can let the compiler choose ===
You don't need to tell the compiler which specific register it should use in each operation, and in general, except you have good reasons to prefer one register specifically, you should better let the compiler decide for you.
You don't need to tell the compiler which specific register it should use in each operation, and in general, except you have good reasons to prefer one register specifically, you should better let the compiler decide for you.


Forcing to use EAX over any other register, for instance, may force the compiler to issue code that will save what was previously in eax in some other register or may introduce unwanted dependencies between operations (pipeline optimization broken)
Forcing to use EAX over any other register, for instance, may force the compiler to issue code that will save what was previously in eax in some other register or may introduce unwanted dependencies between operations (pipeline optimization broken)


The 'wildcards' constraints allows you to give more freedom to GCC and when it comes to input/output mapping:
The 'wildcards' constraints allows you to give more freedom to GCC when it comes to input/output mapping:
{| {{wikitable}}
{| {{wikitable}}
|-
|-
| The "g" constraint : <source lang="c">"movl $0, %0" : "=g" (x)</source>
| The "g" constraint : <syntaxhighlight lang="c">"movl $0, %0" : "=g" (x)</syntaxhighlight>
| x can be whatever the compiler prefers: a register, a memory reference. It could even be a literal constant in another context.
| x can be whatever the compiler prefers: a register, a memory reference. It could even be a literal constant in another context.
|-
|-
| The "r" constraint : <source lang="c">"movl %%es, %0" : "=r" (x)</source>
| The "r" constraint : <syntaxhighlight lang="c">"movl %%es, %0" : "=r" (x)</syntaxhighlight>
| you want x to go through a register. If x wasn't optimized as a register, the compiler will then move it to the place it should be. This means that <code>"movl %0, %%es" : : "r" (0x38)</code> is enough to load a segment register.
| you want x to go through a register. If x wasn't optimized as a register, the compiler will then move it to the place it should be. This means that <code>"movl %0, %%es" : : "r" (0x38)</code> is enough to load a segment register.
|-
|-
| The "N" constraint : <source lang="c">"outl %0, %1" : : "a" (0xFE), "N" (0x21)</source>
| The "N" constraint : <syntaxhighlight lang="c">"outl %0, %1" : : "a" (0xFE), "N" (0x21)</syntaxhighlight>
| tells the value '0x21' can be used as a constant in the out or in operation if ranging from 0 to 255
| tells the value '0x21' can be used as a constant in the out or in operation if ranging from 0 to 255
|}
|}
There are of course a lot more constraints you can put on the operand selection, machine-dependent or not, which are listed in GCC's manual (see [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Simple-Constraints.html#Simple-Constraints], [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Modifiers.html#Modifiers], [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Multi_002dAlternative.html#Multi_002dAlternative], and [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Machine-Constraints.html#Machine-Constraints]).
There are of course a lot more constraints you can put on the operand selection, machine-dependent or not, which are listed in GCC's manual (see [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Simple-Constraints.html#Simple-Constraints], [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Modifiers.html#Modifiers], [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Multi_002dAlternative.html#Multi_002dAlternative], and [http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Machine-Constraints.html#Machine-Constraints]).


==Using C99==
== Using C99 ==
When using <tt>gcc -std=c99</tt> the <tt>asm</tt> keyword might not work directly. Instead use <tt>__asm__</tt>.


<tt>asm</tt> is not a keyword when using <tt>gcc -std=c99</tt>. Simply use <tt>gcc -std=gnu99</tt> to use C99 with GNU extensions. Alternatively, you can use <tt>__asm__</tt> as an alternate keyword that works even when the compiler strictly adheres to the standard.
==Assigning Labels==

== Assigning Labels ==
It is possible to assign so-called ASM labels to C/C++ keywords. You can do this by using the <tt>asm</tt> command on variable definitions, as seen in this example:
It is possible to assign so-called ASM labels to C/C++ keywords. You can do this by using the <tt>asm</tt> command on variable definitions, as seen in this example:


<source lang="c">
<syntaxhighlight lang="c">
int some_obscure_name asm("param") = 5; // "param" will be accessible in inline Assembly.
int some_obscure_name asm("param") = 5; // "param" will be accessible in inline Assembly.


Line 139: Line 141:
asm("mov param, %%eax");
asm("mov param, %%eax");
}
}
</syntaxhighlight>
</source>


Here's an example of how you can access these variables if you don't explicitly state a name:
Here's an example of how you can access these variables if you don't explicitly state a name:


<source lang="c">
<syntaxhighlight lang="c">
int some_obscure_name = 5;
int some_obscure_name = 5;


Line 150: Line 152:
asm("mov some_obscure_name, %%eax");
asm("mov some_obscure_name, %%eax");
}
}
</syntaxhighlight>
</source>


Note that you might also be obliged to use '''_some_obscure_name''' (with a leading underscore), depending on your linkage options.
Note that you might also be obliged to use '''_some_obscure_name''' (with a leading underscore), depending on your linkage options.


==asm goto==
== asm goto ==
Before gcc 4.5, jumping across inline assembly blocks is not supported. The compiler has no way of keeping track of what's going on,
Before GCC 4.5, jumping across inline assembly blocks is not supported. The compiler has no way of keeping track of what's going on,
so incorrect code is almost guaranteed to be generated.
so incorrect code is almost guaranteed to be generated.
<br>You might have been told that "gotos are evil". If you believe that is so, then asm gotos are your worst nightmare coming true.
<br>You might have been told that "gotos are evil". If you believe that is so, then asm gotos are your worst nightmare coming true.
Line 161: Line 163:


asm goto's are not well documented, but their syntax is as follows:
asm goto's are not well documented, but their syntax is as follows:
<source lang="c">
<syntaxhighlight lang="c">
asm goto( "jmp %l[labelname]" : /* no outputs */ : /* inputs */ : "memory" /* clobbers */ : labelname /* any labels used */ );
asm goto( "jmp %l[labelname]" : /* no outputs */ : /* inputs */ : "memory" /* clobbers */ : labelname /* any labels used */ );
</syntaxhighlight>
</source>


One example where this can be useful, is the CMPXCHG instruction (see [http://en.wikipedia.org/wiki/Compare-and-swap Compare and Swap]), which the Linux kernel source code defines as follows:
One example where this can be useful, is the CMPXCHG instruction (see [http://en.wikipedia.org/wiki/Compare-and-swap Compare and Swap]), which the Linux kernel source code defines as follows:
<source lang="c">
<syntaxhighlight lang="c">
/* TODO: You should use modern GCC atomic instruction builtins instead of this. */
#include <stdint.h>
#define cmpxchg( ptr, _old, _new ) { \
#define cmpxchg( ptr, _old, _new ) { \
volatile u32 *__ptr = (volatile u32 *)(ptr); \
volatile uint32_t *__ptr = (volatile uint32_t *)(ptr); \
u32 __ret; \
uint32_t __ret; \
asm volatile( "lock; cmpxchgl %2,%1" \
asm volatile( "lock; cmpxchgl %2,%1" \
: "=a" (__ret), "+m" (*__ptr) \
: "=a" (__ret), "+m" (*__ptr) \
Line 177: Line 181:
__ret; \
__ret; \
}
}
</syntaxhighlight>
</source>


In addition to returning the current value in EAX, CMPXCHG sets the zero flag (Z) when successful. Without asm gotos, your code will have to check the returned value;
In addition to returning the current value in EAX, CMPXCHG sets the zero flag (Z) when successful. Without asm gotos, your code will have to check the returned value;
this CMP instruction can be avoided as follows:
this CMP instruction can be avoided as follows:


<source lang="c">
<syntaxhighlight lang="c">
/* TODO: You should use modern GCC atomic instruction builtins instead of this. */
// Works for both 32 and 64 bit
#include <stdint.h>
#define cmpxchg( ptr, _old, _new, fail_label ) { \
#define cmpxchg( ptr, _old, _new, fail_label ) { \
volatile u32 *__ptr = (volatile u32 *)(ptr); \
volatile uint32_t *__ptr = (volatile uint32_t *)(ptr); \
u32 __ret; \
asm goto( "lock; cmpxchg %1,%0 \t\n" \
asm volatile goto( "lock; cmpxchgl %2,%1 \t\n" \
"jnz %l[" #fail_label "] \t\n" \
"jnz %l[fail_label] \t\n" \
: /* empty */ \
: "=a" (__ret), "+m" (*__ptr) \
: "m" (*__ptr), "r" (_new), "a" (_old) \
: "r" (_new), "0" (_old) \
: "memory", "cc" \
: "memory" \
: fail_label ); \
: fail_label ); \
); \
__ret; \
}
}
</syntaxhighlight>
</source>


This new macro could then be used as follows:
This new macro could then be used as follows:
<source lang="c">
<syntaxhighlight lang="c">

struct Item {
volatile struct Item* next;
};


volatile Item *head;
volatile struct Item *head;


void addItem( Item *i ) {
void addItem( struct Item *i ) {
volatile struct Item *oldHead;
again:
again:
Item *oldHead = head;
oldHead = head;
i->next = oldHead;
i->next = oldHead;
cmpxchg( &tail, oldHead, i, again );
cmpxchg( &head, oldHead, i, again );
}
}


</syntaxhighlight>
</source>


==Intel Syntax==
== Intel Syntax ==
You can let GCC use intel syntax by enabling it in inline Assembly, like so:
You can let GCC use intel syntax by enabling it in inline Assembly, like so:


<source lang="c">
<syntaxhighlight lang="c">
asm(".intel_syntax noprefix");
asm(".intel_syntax noprefix");
asm("mov eax, ebx");
asm("mov eax, ebx");
</syntaxhighlight>
</source>


Similarly, you can switch back to AT&T syntax by using the following snippet:
Similarly, you can switch back to AT&T syntax by using the following snippet:


<source lang="c">
<syntaxhighlight lang="c">
asm(".att_syntax prefix");
asm(".att_syntax prefix");
asm("mov %ebx, %eax");
asm("mov %ebx, %eax");
</syntaxhighlight>
</source>


This way you can combine Intel syntax and AT&T syntax inline Assembly. Note that once you trigger one of these syntax types, everything below the command in the source file will be assembled using this syntax, so don't forget to switch back when necessary, or you might get lots of compile errors!
This way you can combine Intel syntax and AT&T syntax inline Assembly. Note that once you trigger one of these syntax types, everything below the command in the source file will be assembled using this syntax, so don't forget to switch back when necessary, or you might get lots of compile errors!
Line 230: Line 239:
There is also a command-line option <tt>-masm=intel</tt> to globally trigger Intel syntax.
There is also a command-line option <tt>-masm=intel</tt> to globally trigger Intel syntax.


==See Also==
== See Also ==

===Articles===
=== Articles ===
* [[Inline Assembly/Examples]] - useful and commonly used functions
* [[Inline Assembly/Examples]] - useful and commonly used functions


===Forums===
=== Forum Threads ===
* [http://forum.osdev.org/viewtopic.php?f=11&t=24168&p=196655&hilit=asm+volatile+moved asm volatile being moved]
* [http://forum.osdev.org/viewtopic.php?f=11&t=24168&p=196655&hilit=asm+volatile+moved asm volatile being moved]


===External===
=== External ===
* [http://gcc.gnu.org/onlinedocs/ GCC Manuals]
* [http://gcc.gnu.org/onlinedocs/ GCC Manuals]
* [http://www-106.ibm.com/developerworks/library/l-ia.html Inline assembly for x86 in Linux (by IBM)]
* [http://web.archive.org/web/20041210030000/http://www-106.ibm.com/developerworks/library/l-ia.html Inline assembly for x86 in Linux (by IBM)]
* [http://msdn.microsoft.com/en-us/library/26td21ds(VS.80).aspx Visual C++ Compiler Intrinsics]


[[Category:Assembly]]
[[Category:Assembly]]
[[de:Inline-Assembler_mit_GCC]]

Latest revision as of 05:22, 9 June 2024

The idea behind Inline Assembly is to embed assembler instructions in your C/C++ code, using the asm keyword, when there's no option but to use Assembly language.

Overview

Sometimes, even though C/C++ is your language of choice, you need to use some assembler code in your operating system. Be it because of extreme optimization needs or because the code you're implementing is highly hardware-specific (like, say, outputting data through a port), the result is the same: there's no way around it. You must use assembly.

One of the options you have is writing an assembly function and calling it; however, there can be times when even the "call" overhead is too much for you. In that case, what you need is inline assembly, which means inserting arbitrary assembly snippets in the middle of your code, using the asm() keyword. The way this keyword works is compiler-specific. This article describes the way it works in GCC since it is by far the most used compiler in the OS world.

Syntax

This is the syntax for using the asm() keyword in your C/C++ code:

asm ( assembler template
    : output operands                   (optional)
    : input operands                    (optional)
    : clobbered registers list          (optional)
    );

The assembler template is basically GAS-compatible code, except when the statement has operand sections (anything after a colon, even if the section is empty), in which case register names must start with %% instead of %. This means that the following two lines of code will both move the contents of the eax register into ebx:

asm ("movl %eax, %ebx");
asm ("movl %%eax, %%ebx" : );

Now, you may wonder why this %% is needed. This is where an interesting feature of inline assembly comes in: you can make use of some of your C variables in your assembly code. To keep this mechanism simple, GCC names these variables %0, %1, and so on in your assembly code, numbering the output operands first and then continuing with the input operands. Since a single % now introduces an operand, you're required to use the %% syntax for registers in order to help GCC differentiate between registers and parameters.

How exactly operands work will be explained in more detail in later sections. For now, it is sufficient to say that if you write something like this:

int a=10, b;
asm ("movl %1, %%eax;"
     "movl %%eax, %0;"
     :"=r"(b)        /* output */
     :"r"(a)         /* input */
     :"%eax"         /* clobbered register */
     );

then you've managed to copy the value of "a" into "b" using assembly code, effectively using some C variables in your assembly code. Congratulations!

The last "clobbered register" section is used to tell GCC that your code uses some of the processor's registers, and that it should move any live data from the running program out of those registers before executing the asm snippet. In the example above, the first instruction moves a into eax, overwriting whatever eax held before, so we need to ask GCC to keep anything it still needs out of that register.
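As a minimal sketch of how operand numbering and the clobber list fit together, the hypothetical helper below adds two integers through eax. The function name add_ints and the portable fallback branch are assumptions made for illustration, not part of the original article; the fallback only exists so that the sketch still builds on non-x86 targets.

```c
/* Hypothetical helper: adds two ints using eax as a scratch register. */
static int add_ints(int x, int y)
{
    int sum;
#if defined(__i386__) || defined(__x86_64__)
    /* %0 is sum (the only output); %1 and %2 are x and y (the inputs). */
    __asm__ ("movl %1, %%eax\n\t"   /* eax = x        */
             "addl %2, %%eax\n\t"   /* eax = eax + y  */
             "movl %%eax, %0"       /* sum = eax      */
             : "=r" (sum)
             : "r" (x), "r" (y)
             : "eax", "cc");        /* eax and the flags are overwritten */
#else
    /* Assumed fallback for non-x86 targets. */
    sum = x + y;
#endif
    return sum;
}
```

Because eax is listed as clobbered, GCC will not place sum, x or y in it, so the scratch register cannot collide with an operand.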

Assembler Template

The Assembler Template defines the assembler instructions to inline. The default is to use AT&T syntax here. If you want to use Intel syntax, -masm=intel should be specified as a command-line option.

As an example, to halt the CPU, you just have to use the following statement:

asm( "hlt" );
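A template often holds more than one instruction. The sketch below, assuming the default AT&T syntax (immediates prefixed with $, source operand first), joins two instructions with "\n\t" so that each ends up on its own line in the generated assembly. The function name forty_two and the non-x86 fallback are illustrative assumptions.

```c
/* Hypothetical helper: builds the value 42 with two AT&T-syntax instructions. */
static int forty_two(void)
{
    int value;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl $5, %0\n\t"      /* value = 5   */
             "addl $37, %0"         /* value += 37 */
             : "=r" (value)
             :
             : "cc");               /* addl changes the flags */
#else
    /* Assumed fallback for non-x86 targets. */
    value = 5 + 37;
#endif
    return value;
}
```

Note that this template would no longer assemble if the file were compiled with -masm=intel, since it is written in AT&T syntax.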

Output Operands

The Output Operands section is used in order to tell the compiler / assembler how it should handle C variables used to store some output from the ASM code. The Output Operands are a list of pairs, each operand consisting of a string literal, known as "constraint", stating where the C variable should be mapped (registers are generally used for optimal performance), and a C variable to map to (in parentheses).

In the constraint, 'a' refers to EAX, 'b' to EBX, 'c' to ECX, 'd' to EDX, 'S' to ESI, and 'D' to EDI (read the GCC manual for a full list), assuming that you are coding for the IA32 architecture. An equals sign (=) indicates that the operand is write-only, i.e. your assembly code does not care about the initial value of the mapped variable (which allows some optimization). With all that in mind, it's now pretty clear that the following code sets the C variable EAX to 0 by way of the EAX register.

int EAX;
asm( "movl $0, %0"
   : "=a" (EAX)
    );

Notice that the compiler numbers the operands starting with %0, and that you don't have to add a register to the clobbered register list if it's used to store an output operand. GCC is smart enough to figure out what to do all by itself.

Starting with GCC 3.1, you can use more readable labels instead of the error-prone enumeration:

int current_task;
asm( "str %[output]"
   : [output] "=r" (current_task)
    );

These labels are in a namespace of their own, and will not collide with any C identifiers. The same can be done for input operands, too.
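The following sketch combines register-specific constraints with a named input operand. It wraps the x86 divl instruction, which divides EDX:EAX by its operand and leaves the quotient in EAX and the remainder in EDX. The type divmod_t, the function divmod_u32 and the fallback branch are hypothetical names introduced for this example only.

```c
#include <stdint.h>

/* Hypothetical result type for the sketch below. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} divmod_t;

/* Unsigned 32-bit division; the caller must make sure divisor != 0. */
static divmod_t divmod_u32(uint32_t dividend, uint32_t divisor)
{
    divmod_t out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("divl %[divisor]"
             : "=a" (out.quot), "=d" (out.rem)  /* results come back in EAX, EDX */
             : "a" (dividend), "d" (0),         /* EDX:EAX holds the dividend    */
               [divisor] "rm" (divisor)         /* register or memory, GCC picks */
             : "cc");
#else
    /* Assumed fallback for non-x86 targets. */
    out.quot = dividend / divisor;
    out.rem  = dividend % divisor;
#endif
    return out;
}
```

The operand name divisor inside the square brackets lives in its own namespace, which is why it can share its spelling with the C parameter.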

Input Operands

While the Output Operands are generally used for... well... output, the Input Operands allow you to parametrize the ASM code, i.e., to pass read-only parameters from C code to the ASM block. Again, string literals are used to specify the details.

If you want to move some value to EAX, you can do it the following way (even though it would certainly be pretty useless to do so instead of directly mapping the value to EAX):

int randomness = 4;
asm( "movl %0, %%eax"
   :
   : "b" (randomness)
   : "eax"
    );

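Note that GCC will always assume that input operands are read-only (unchanged). The correct thing to do when an input operand is also written to is to list it as an output with the '+' modifier instead of the equals sign, because this time its original value matters. Any other register that the code overwrites belongs in the clobber list. Here is a simple input-only example:

asm("mov %%eax,%%ebx": : "a" (amount) : "ebx"); // useless, but it gets the idea across

EAX will contain "amount", which is then copied into EBX. EAX itself is left unchanged, so it can stay a plain input; EBX is overwritten and is therefore listed as clobbered.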
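For the case where the assembly code does modify its input, a minimal sketch could look like this. The function name swap_bytes is invented for the example; BSWAP reverses the byte order of a 32-bit register in place, so the operand is both read and written.

```c
#include <stdint.h>

/* Hypothetical helper: reverses the byte order of a 32-bit value. */
static inline uint32_t swap_bytes(uint32_t value)
{
    /* "+r": the operand lives in a register, is read by the instruction
       and is overwritten with the result. BSWAP leaves the flags alone,
       so no clobbers are listed. */
    __asm__("bswap %0" : "+r"(value));
    return value;
}
```

Declaring value as a plain input here would let the compiler assume the register still holds the original number afterwards.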

Clobbered Registers List

It is important to remember one thing: the C/C++ compiler knows nothing about assembler. For the compiler, the asm statement is opaque, and if none of the outputs you specified are used afterwards, it might even come to the conclusion that it's a no-op and optimize it away. Some third-party docs indicate that using asm volatile will cause the statement to not be moved. However, according to the GCC documentation, "The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable.", which only indicates that it will not be deleted (i.e. whether it may still be moved is an unanswered question). An approach that should work is to use asm volatile and put "memory" in the clobber list, like so:

// The "memory" clobber keeps the compiler from reordering memory accesses across the statement.
__asm__("cli": : :"memory"); // GCC documents an asm without output operands as implicitly volatile.
__asm__ __volatile__("cli": : :"memory"); // Explicit form: neither moved across memory accesses nor optimized away.

Since the compiler uses CPU registers for internal optimization of your C/C++ variables, and doesn't know about ASM opcodes, you have to warn it about any registers that might get clobbered as a side effect, so the compiler can save their contents before running your assembly code.

The Clobbered Registers List is a comma-separated list of register names, as string literals. Two special names are accepted as well: "cc" states that the flags register is modified, and "memory" states that the code reads or writes memory other than its listed operands.
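As a sketch of a clobber list in practice, consider CPUID, which overwrites EAX, EBX, ECX and EDX no matter which of them you care about. The function name cpuid_max_leaf is hypothetical, and the example assumes a build where EBX is free to be clobbered (older GCC versions reserve it in 32-bit position-independent code).

```c
#include <stdint.h>

/* Hypothetical helper: returns the highest basic CPUID leaf supported. */
static inline uint32_t cpuid_max_leaf(void)
{
    uint32_t eax = 0;  /* leaf 0 is requested through EAX */
    __asm__ volatile("cpuid"
                     : "+a"(eax)              /* EAX is read and written */
                     :                        /* no separate inputs */
                     : "ebx", "ecx", "edx");  /* overwritten as a side effect */
    return eax;
}
```

EAX does not appear in the clobber list because it is already an operand; only the registers the compiler cannot otherwise know about are named there.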

Wildcards: How you can let the compiler choose

You don't need to tell the compiler which specific register it should use in each operation, and in general, unless you have good reasons to prefer one register specifically, you had better let the compiler decide for you.

Forcing the use of EAX over any other register, for instance, may force the compiler to emit code that saves what was previously in EAX to some other register, or may introduce unwanted dependencies between operations (breaking pipeline optimization).

The 'wildcard' constraints allow you to give more freedom to GCC when it comes to input/output mapping:

The "g" constraint:
"movl $0, %0" : "=g" (x)
x can be whatever the compiler prefers: a register or a memory reference. As an input, it could even be a literal constant.
The "r" constraint:
"movl %%es, %0" : "=r" (x)
You want x to go through a register. If x isn't already held in a register, the compiler will move the value to where it belongs afterwards. This means that "movl %0, %%es" : : "r" (0x38) is enough to load a segment register.
The "N" constraint:
"outl %0, %1" : : "a" (0xFE), "N" (0x21)
This tells the compiler that the value 0x21 may be encoded as an immediate constant in the out or in instruction, which is allowed for port numbers ranging from 0 to 255.

There are of course a lot more constraints you can put on the operand selection, machine-dependent or not, which are listed in GCC's manual (see [1], [2], [3], and [4]).
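To see the freedom that "g" gives the compiler, here is a small sketch built around ADD. The function name add_flexible is invented for this example; "+r" marks the first operand as a register that is both read and written.

```c
#include <stdint.h>

/* Hypothetical helper: adds b to a using a single ADD instruction. */
static inline uint32_t add_flexible(uint32_t a, uint32_t b)
{
    /* %1 may be a register, a memory location or an immediate constant;
       ADD accepts all three as its source, so "g" is safe here. */
    __asm__("addl %1, %0"
            : "+r"(a)
            : "g"(b)
            : "cc");   /* ADD changes the flags */
    return a;
}
```

When b is a compile-time constant the compiler is free to emit something like addl $3, %eax instead of loading the constant into a register first. A constraint this loose is only correct when the instruction really accepts every operand kind it allows.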

Using C99

asm is not a keyword when using gcc -std=c99. Simply use gcc -std=gnu99 to use C99 with GNU extensions. Alternatively, you can use __asm__ as an alternate keyword that works even when the compiler strictly adheres to the standard.

Assigning Labels

It is possible to assign so-called ASM labels to C/C++ identifiers. You can do this by using the asm keyword on variable definitions, as seen in this example:

int some_obscure_name asm("param") = 5; // "param" will be accessible in inline Assembly.

void foo()
{
   asm("mov param, %%eax" : : : "eax"); // extended asm: "%%" prints a single "%", and EAX is declared as clobbered
}

Here's an example of how you can access these variables if you don't explicitly state a name:

int some_obscure_name = 5;

void foo()
{
   asm("mov some_obscure_name, %%eax" : : : "eax"); // extended asm: "%%" prints a single "%", and EAX is declared as clobbered
}

Note that you might also be obliged to use _some_obscure_name (with a leading underscore), depending on your linkage options.
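If all you need is the variable's contents, one way to sidestep symbol names (and the leading-underscore question) altogether is to hand the variable to the asm statement as a memory operand. This is only a sketch; the names tick_count and read_tick_count are made up for the example.

```c
/* Hypothetical global that the assembly code wants to read. */
int tick_count = 5;

static inline int read_tick_count(void)
{
    int value;
    /* "m" makes the compiler print whatever addressing expression is
       right for this build, so the symbol name never appears in the template. */
    __asm__("movl %1, %0" : "=r"(value) : "m"(tick_count));
    return value;
}
```

The trade-off is that the variable must be visible to the C code at the point of the asm statement, which is not the case for symbols defined only in assembly files.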

asm goto

Before GCC 4.5, jumping out of an inline assembly block to a C label was not supported. The compiler had no way of keeping track of what was going on, so incorrect code was almost guaranteed to be generated.
You might have been told that "gotos are evil". If you believe that is so, then asm gotos are your worst nightmare come true. However, they do offer some interesting code optimization options.

asm gotos are not well documented, but their syntax is as follows:

 asm goto( "jmp %l[labelname]" : /* no outputs */ : /* inputs */ : "memory" /* clobbers */ : labelname /* any labels used */ );
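A minimal, self-contained use of this syntax might look like the sketch below, which branches straight to a C label when a value is zero. The function name is_zero and the label name zero are assumptions for this example; it needs GCC 4.5 or later (or another compiler that implements asm goto).

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if value is zero, 0 otherwise. */
static int is_zero(uint32_t value)
{
    __asm__ goto("testl %0, %0\n\t"
                 "jz %l[zero]"        /* %l[...] prints the label's assembler name */
                 : /* no outputs */
                 : "r"(value)
                 : "cc"               /* TEST changes the flags */
                 : zero);             /* every label the code may jump to */
    return 0;                         /* fall-through path: value was non-zero */
zero:
    return 1;
}
```

The compiler treats each listed label as a possible successor of the statement, which is what keeps the control flow information correct.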

One example where this can be useful, is the CMPXCHG instruction (see Compare and Swap), which the Linux kernel source code defines as follows:

/* TODO: You should use modern GCC atomic instruction builtins instead of this. */
#include <stdint.h>
/* GNU statement expression: the last expression (__ret) is the macro's value. */
#define cmpxchg( ptr, _old, _new ) ({ \
  volatile uint32_t *__ptr = (volatile uint32_t *)(ptr);   \
  uint32_t __ret;                                     \
  asm volatile( "lock; cmpxchgl %2,%1"           \
    : "=a" (__ret), "+m" (*__ptr)                \
    : "r" (_new), "0" (_old)                     \
    : "memory");                                 \
  __ret;                                         \
})

In addition to returning the current value in EAX, CMPXCHG sets the zero flag (Z) when successful. Without asm gotos, your code will have to check the returned value; this CMP instruction can be avoided as follows:

/* TODO: You should use modern GCC atomic instruction builtins instead of this. */
// Works for both 32 and 64 bit
#include <stdint.h>
#define cmpxchg( ptr, _old, _new, fail_label ) { \
  volatile uint32_t *__ptr = (volatile uint32_t *)(ptr);   \
  asm goto( "lock; cmpxchg %1,%0 \t\n"           \
    "jnz %l[" #fail_label "] \t\n"               \
    : /* empty */                                \
    : "m" (*__ptr), "r" (_new), "a" (_old)       \
    : "memory", "cc"                             \
    : fail_label );                              \
}

This new macro could then be used as follows:

struct Item {
  volatile struct Item* next;
};

volatile struct Item *head;

void addItem( struct Item *i ) {
  volatile struct Item *oldHead;
again:
  oldHead = head;
  i->next = oldHead;
  cmpxchg( &head, oldHead, i, again );
}
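Following the TODO notes in the macros above, the same lock-free list insertion can be sketched with GCC's __atomic builtins instead of hand-written CMPXCHG. This is not the Linux kernel's code; the names list_head and add_item, and the choice of memory orders, are assumptions made for this example.

```c
#include <stddef.h>

struct Item {
    struct Item *next;
};

/* Hypothetical list head shared between threads or CPUs. */
static struct Item *list_head = NULL;

static void add_item(struct Item *item)
{
    struct Item *old_head = __atomic_load_n(&list_head, __ATOMIC_RELAXED);
    do {
        /* On failure the builtin stores the current head into old_head,
           so the loop simply relinks and tries again. */
        item->next = old_head;
    } while (!__atomic_compare_exchange_n(&list_head, &old_head, item,
                                          1 /* weak */,
                                          __ATOMIC_RELEASE,
                                          __ATOMIC_RELAXED));
}
```

On x86 the compiler is expected to turn this into a lock cmpxchg loop on its own, and it also knows that the comparison value is replaced when the exchange fails.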

Intel Syntax

You can let GCC use Intel syntax by enabling it in inline assembly, like so:

asm(".intel_syntax noprefix");
asm("mov eax, ebx");

Similarly, you can switch back to AT&T syntax by using the following snippet:

asm(".att_syntax prefix");
asm("mov %ebx, %eax");

This way you can combine Intel syntax and AT&T syntax inline assembly. Note that once you switch to one of these syntax types, everything below the directive in the source file, including the code the compiler generates itself, will be assembled using this syntax, so don't forget to switch back when necessary, or you might get lots of assembler errors!

There is also a command-line option -masm=intel to globally trigger Intel syntax.
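One way to keep a switch from leaking into the rest of the file is to change syntax and change back inside a single asm statement. The sketch below assumes the file is otherwise compiled in the default AT&T mode (no -masm=intel); the function name load_answer is made up, and the register is named directly rather than through %0 because operands would still be printed in AT&T form.

```c
/* Hypothetical helper: loads a constant using Intel-syntax assembly. */
static inline int load_answer(void)
{
    int value;
    __asm__(".intel_syntax noprefix\n\t"
            "mov eax, 42\n\t"            /* destination first, no % or $ prefixes */
            ".att_syntax prefix"         /* restore the mode the compiler expects */
            : "=a"(value));              /* "=a" ties the result to EAX */
    return value;
}
```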

See Also

Articles

Forum Threads

External