Inline Assembly: Difference between revisions

Bot: Replace deprecated source tag with syntaxhighlight
This is the syntax for using the <tt>asm()</tt> keyword in your C/C++ code:
 
<syntaxhighlight lang="c">
asm ( assembler template
    : output operands (optional)
    : input operands (optional)
    : clobbered registers list (optional)
    );
</syntaxhighlight>
 
The assembler template is basically [[GAS]]-compatible code, except that when the statement has operand or clobber sections (even empty ones), register names must start with %% instead of %. This means that the following two lines of code will both move the contents of the <tt>eax</tt> register into <tt>ebx</tt>:
 
<syntaxhighlight lang="c">
asm ("movl %eax, %ebx");
asm ("movl %%eax, %%ebx" : );
</syntaxhighlight>
 
Now, you may wonder why this %% is needed. This is where an interesting feature of inline assembly comes in: you can make use of some of your C variables in your assembly code. To keep this mechanism simple, GCC names these variables %0, %1, and so on in your assembly code, numbered in the order in which they appear in the operand sections (outputs first, then inputs). You are therefore required to use the %% syntax so that GCC can differentiate between registers and operands.

How exactly operands work will be explained in more detail in later sections. For now, it is sufficient to say that if you write something like this:
 
<syntaxhighlight lang="c">
int a=10, b;
asm ("movl %1, %%eax\n\t"
     "movl %%eax, %0"
     :"=r"(b) /* output */
     :"r"(a) /* input */
     :"%eax" /* clobbered register */
     );
</syntaxhighlight>
 
then you've managed to copy the value of "a" into "b" using assembly code, effectively using some C variables in your assembly code. Congratulations!
As an example, to halt the CPU, you just have to use the following statement:
 
<syntaxhighlight lang="c">
asm( "hlt" );
</syntaxhighlight>
 
=== Output Operands ===
In the constraint, 'a' refers to EAX, 'b' to EBX, 'c' to ECX, 'd' to EDX, 'S' to ESI, and 'D' to EDI (read the GCC manual for a full list), assuming that you are coding for the IA-32 architecture. An equals sign ('=') indicates that your assembly code does not care about the initial value of the mapped variable, that is, the operand is write-only (which allows some optimization). With all that in mind, it should be clear that the following code sets EAX to 0 and stores the result in the C variable <tt>EAX</tt>.
 
<syntaxhighlight lang="c">
int EAX;
asm( "movl $0, %0"
   : "=a" (EAX)
   );
</syntaxhighlight>
 
Notice that the compiler numbers the operands starting with %0, and that you don't have to add a register to the clobbered register list if it's used to store an output operand. GCC is smart enough to figure out what to do all by itself.
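
As a further illustration of outputs bound to specific registers, here is a minimal sketch of an unsigned division that reads back both EAX and EDX. The type <tt>divmod_result</tt> and the function <tt>divmod_u32</tt> are hypothetical names chosen for this example, it already uses input operands (covered in a later section), and the plain C fallback is only there so the snippet also builds on non-x86 targets.

```c
/* Hypothetical result type for this sketch. */
typedef struct {
    unsigned quotient;
    unsigned remainder;
} divmod_result;

/* Hypothetical helper: divide two unsigned 32-bit values with DIVL. */
divmod_result divmod_u32(unsigned dividend, unsigned divisor)
{
    divmod_result res;
#if defined(__i386__) || defined(__x86_64__)
    /* DIVL divides EDX:EAX by its operand and leaves the quotient in EAX
       and the remainder in EDX. Both registers are listed as outputs,
       so neither of them appears in the clobber list. */
    __asm__ ("divl %2"
             : "=a" (res.quotient), "=d" (res.remainder)  /* %0, %1 */
             : "r" (divisor), "a" (dividend), "d" (0u)    /* %2, %3, %4 */
             : "cc");                                     /* flags are modified */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    res.quotient = dividend / divisor;
    res.remainder = dividend % divisor;
#endif
    return res;
}
```

Calling <tt>divmod_u32(17, 5)</tt> should yield a quotient of 3 and a remainder of 2. The caller is assumed to pass a non-zero divisor.
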
Starting with GCC 3.1, you can use more readable labels instead of the error-prone enumeration:
 
<syntaxhighlight lang="c">
int current_task;
asm( "str %[output]"
   : [output] "=r" (current_task)
   );
</syntaxhighlight>
 
These labels are in a namespace of their own, and will not collide with any C identifiers. The same can be done for input operands, too.
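
A minimal sketch of named operands on both the output and the input side might look as follows. The function name <tt>double_value</tt> and the operand names <tt>dst</tt> and <tt>src</tt> are made up for this example, and the <tt>&</tt> (early clobber) modifier is an addition not discussed above: it tells GCC that the output is written before the input has been read for the last time.

```c
/* Hypothetical helper: returns value * 2 using named asm operands. */
int double_value(int value)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %[src], %[dst]\n\t"   /* dst = src  */
             "addl %[src], %[dst]"       /* dst += src */
             : [dst] "=&r" (result)      /* early-clobber output register */
             : [src] "r" (value)         /* input register */
             : "cc");                    /* ADD modifies the flags */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    result = value + value;
#endif
    return result;
}
```

The names in square brackets only exist inside the <tt>asm</tt> statement, so they could just as well match the C variable names without causing a conflict.
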
If you want to move some value to EAX, you can do it the following way (even though it would certainly be pretty useless to do so instead of directly mapping the value to EAX):
 
<syntaxhighlight lang="c">
int randomness = 4;
asm( "movl %0, %%eax"
   :
   : "b" (randomness)
   : "eax"
   );
</syntaxhighlight>
 
Note that GCC will always assume that input operands are read-only (unchanged). If your assembly code writes to an operand whose original value also matters, list it as an output and use the '+' (read-write) modifier instead of the equals sign. Here is a simple example of a plain input operand:
<syntaxhighlight lang="c">
asm("mov %%eax,%%ebx": : "a" (amount) : "ebx");//useless but it gets the idea
</syntaxhighlight>
EAX is loaded with <tt>amount</tt>, which is then copied into EBX. Because EBX is overwritten, it is listed as clobbered.
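
For an operand that is both read and written, a minimal sketch could look like this. The function name <tt>increment</tt> is hypothetical, and the <tt>"+r"</tt> constraint marks the operand as a read-write register operand.

```c
/* Hypothetical helper: increments its argument in place with INCL. */
int increment(int value)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "+r": the register holds the old value on entry and the
       new value on exit, so GCC must neither assume it is unchanged
       nor discard its initial contents. */
    __asm__ ("incl %0"
             : "+r" (value)
             :              /* no separate inputs */
             : "cc");       /* INC modifies the flags */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    value += 1;
#endif
    return value;
}
```

With this, <tt>increment(41)</tt> should return 42.
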
 
It is important to remember one thing: ''The C/C++ compiler knows nothing about Assembler''. For the compiler, the asm statement is opaque, and if you did not specify any output, it might even come to the conclusion that it's a no-op and optimize it away. Some third-party docs indicate that using <tt>asm volatile</tt> will prevent the statement from being moved. However, according to the GCC documentation, ''The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable.'', which only indicates that it will not be deleted (i.e. whether it may still be moved is an unanswered question). An approach that should work is to use <tt>asm volatile</tt> and put '''memory''' in the clobber list, like so:
 
<syntaxhighlight lang="c">
__asm__("cli": : :"memory"); // Will cause the statement not to be moved, but it may be optimized away.
__asm__ __volatile__("cli": : :"memory"); // Will cause the statement not to be moved nor optimized away.
</syntaxhighlight>
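
The same combination is often used with an empty template as a pure compiler barrier. The following is a minimal sketch under that assumption. The names <tt>payload</tt>, <tt>ready</tt> and <tt>publish</tt> are hypothetical, and the barrier only constrains the compiler: it emits no instruction and is not a CPU memory fence.

```c
/* Hypothetical shared variables for this sketch. */
int payload;
int ready;

/* Hypothetical helper: store the payload, then raise the ready flag. */
void publish(int value)
{
    payload = value;

    /* Empty template plus "memory" clobber: no instruction is emitted,
       but the compiler may not move memory accesses across this point
       or keep memory values cached in registers over it. */
    __asm__ __volatile__ ("" : : : "memory");

    ready = 1;
}
```

Because the template is empty, this sketch is not tied to x86. On multiprocessor systems a real hardware barrier may be needed in addition.
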
 
Since the compiler uses CPU registers for internal optimization of your C/C++ variables, and doesn't know about ASM opcodes, you have to warn it about any registers that might get clobbered as a side effect, so the compiler can save their contents before making your ASM call.
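
As a minimal sketch of a clobber list in practice, the following hypothetical function <tt>triple_value</tt> uses ECX as a scratch register and declares it, together with the condition flags, as clobbered.

```c
/* Hypothetical helper: returns value * 3, using ECX as scratch space. */
int triple_value(int value)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %1, %%ecx\n\t"    /* ecx = value   */
             "addl %1, %%ecx\n\t"    /* ecx += value  */
             "addl %1, %%ecx\n\t"    /* ecx += value  */
             "movl %%ecx, %0"        /* result = ecx  */
             : "=r" (result)
             : "r" (value)
             : "ecx", "cc");         /* ECX and the flags are overwritten */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    result = value * 3;
#endif
    return result;
}
```

Listing <tt>"ecx"</tt> as clobbered also keeps GCC from choosing ECX for either of the two <tt>"r"</tt> operands.
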
{| {{wikitable}}
|-
| The "g" constraint : <syntaxhighlight lang="c">"movl $0, %0" : "=g" (x)</syntaxhighlight>
| x can be whatever the compiler prefers: a register or a memory reference. It could even be a literal constant in another context.
|-
| The "r" constraint : <syntaxhighlight lang="c">"movl %%es, %0" : "=r" (x)</syntaxhighlight>
| x must go through a register. If x is not already held in a register, the compiler adds the code needed to move it to or from its actual location. This means that <code>"movl %0, %%es" : : "r" (0x38)</code> is enough to load a segment register.
|-
| The "N" constraint : <syntaxhighlight lang="c">"outl %0, %1" : : "a" (0xFE), "N" (0x21)</syntaxhighlight>
| tells the compiler that the value 0x21 can be used as an immediate constant in the out or in instruction, provided it is in the range 0 to 255
|}
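
The port I/O example above needs kernel privileges, so here is a minimal user-mode sketch of two constraints instead. The function names are hypothetical, and the <tt>"I"</tt> constraint (an x86 constant in the range 0 to 31, intended for 32-bit shift counts) is an addition that does not appear in the table.

```c
/* Hypothetical helper: "g" lets GCC pick a register, a memory
   reference or an immediate for the source operand. */
int load_general(int value)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %1, %0"
             : "=r" (result)   /* destination must be a register */
             : "g" (value));   /* source: whatever GCC prefers   */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    result = value;
#endif
    return result;
}

/* Hypothetical helper: "I" requires a compile-time constant in 0..31. */
int shift_left_by_3(int value)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("shll %1, %0"
             : "+r" (value)    /* shifted in place        */
             : "I" (3)         /* immediate shift count   */
             : "cc");          /* SHL modifies the flags  */
#else
    value = (int)((unsigned)value << 3);
#endif
    return value;
}
```

Since the destination of <tt>movl</tt> is a register here, every form that <tt>"g"</tt> may choose for the source is a valid operand.
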
It is possible to assign so-called ASM labels to C/C++ identifiers. You can do this by using the <tt>asm</tt> keyword on variable definitions, as seen in this example:
 
<syntaxhighlight lang="c">
int some_obscure_name asm("param") = 5; // "param" will be accessible in inline Assembly.

void foo()
{
   asm("mov param, %eax"); // no operand sections, so a single % is used
}
</syntaxhighlight>
 
Here's an example of how you can access these variables if you don't explicitly state a name:
 
<syntaxhighlight lang="c">
int some_obscure_name = 5;

void foo()
{
   asm("mov some_obscure_name, %eax"); // no operand sections, so a single % is used
}
</syntaxhighlight>
 
Note that you might also be obliged to use '''_some_obscure_name''' (with a leading underscore), depending on your linkage options.
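
If the symbol name itself is not important, one way to sidestep the naming question is to pass the variable as a memory operand and let GCC emit the reference. This is a minimal sketch of that alternative, not the technique described above, and the function name <tt>read_obscure</tt> is hypothetical.

```c
int some_obscure_name = 5;

/* Hypothetical helper: reads the global through an "m" operand. */
int read_obscure(void)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    /* GCC substitutes a suitable memory reference for %1, so the code
       does not depend on how the symbol is spelled at link level. */
    __asm__ ("movl %1, %0"
             : "=r" (result)
             : "m" (some_obscure_name));
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    result = some_obscure_name;
#endif
    return result;
}
```

This also keeps working when the code is built as position-independent, because the addressing mode is chosen by the compiler.
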

<tt>asm goto</tt> statements are not well documented, but their syntax is as follows:
<syntaxhighlight lang="c">
asm goto( "jmp %l[labelname]" : /* no outputs */ : /* inputs */ : "memory" /* clobbers */ : labelname /* any labels used */ );
</syntaxhighlight>
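
A minimal user-mode sketch of this syntax is shown below. The function name <tt>is_zero</tt> and the label <tt>zero</tt> are hypothetical, and the example assumes a compiler that supports <tt>asm goto</tt>.

```c
/* Hypothetical helper: returns 1 if value is zero, otherwise 0. */
int is_zero(int value)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ goto ("testl %0, %0\n\t"   /* set ZF if value == 0       */
                  "jz %l[zero]"        /* jump to the C label 'zero' */
                  : /* no outputs */
                  : "r" (value)
                  : "cc"               /* TEST modifies the flags */
                  : zero);             /* labels the asm may jump to */
    return 0;                          /* fall-through: value != 0 */
zero:
    return 1;
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    return value == 0;
#endif
}
```

Every label the template can jump to has to be listed in the fourth section, otherwise the compiler cannot account for the extra control flow.
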
 
One example where this can be useful is the CMPXCHG instruction (see [http://en.wikipedia.org/wiki/Compare-and-swap Compare and Swap]), which the Linux kernel source code defines as follows:
 
This new macro could then be used as follows:
<syntaxhighlight lang="c">

struct Item {
    /* ... */
}

</syntaxhighlight>
 
== Intel Syntax ==
You can let GCC use Intel syntax by enabling it in inline Assembly, like so:
 
<syntaxhighlight lang="c">
asm(".intel_syntax noprefix");
asm("mov eax, ebx");
</syntaxhighlight>
 
Similarly, you can switch back to AT&T syntax by using the following snippet:
 
<syntaxhighlight lang="c">
asm(".att_syntax prefix");
asm("mov %ebx, %eax");
</syntaxhighlight>
 
This way you can combine Intel syntax and AT&T syntax inline Assembly. Note that once you trigger one of these syntax types, everything below the command in the source file will be assembled using this syntax, so don't forget to switch back when necessary, or you might get lots of compile errors!
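
One way to reduce the risk of forgetting to switch back is to change the syntax and restore it inside a single statement. The following is a minimal sketch under the assumptions that the code is assembled by GAS and that the compiler is emitting AT&T syntax by default (that is, it is not invoked with <tt>-masm=intel</tt>). The function name is hypothetical, and fixed registers are used because operands substituted by GCC would still be printed in AT&T form.

```c
/* Hypothetical helper: copies ECX to EAX using Intel syntax. */
int copy_with_intel_syntax(int value)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (".intel_syntax noprefix\n\t"   /* switch to Intel syntax    */
             "mov eax, ecx\n\t"             /* destination first: eax = ecx */
             ".att_syntax prefix"           /* restore AT&T for the rest */
             : "=a" (result)                /* EAX holds the result      */
             : "c" (value));                /* ECX holds the input       */
#else
    /* Fallback for other architectures (assumption: plain C is acceptable). */
    result = value;
#endif
    return result;
}
```

Because both directives sit in the same template, the compiler-generated code that follows is assembled in AT&T syntax again.
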