Inline Assembly: Difference between revisions

Reformatted source tags, added info about intel syntax.
If you are using a [[GCC]] toolchain, using '''Inline Assembly''' might be a viable option for you. The idea is to embed assembler instructions in your C/C++ code, using the asm keyword.
 
==Overview==
Sometimes, even though C/C++ is your language of choice, you '''need''' to use some assembly code in your operating system. Whether because of extreme optimization needs or because the code you're implementing is highly hardware-specific (like, say, outputting data through a port), the result is the same: there's no way around it. You must use assembly.
This is the prototype for calling asm() in your C/C++ code:
 
<source lang="c">
asm ( assembler template
    : output operands                  (optional)
    : input operands                   (optional)
    : clobbered registers list         (optional)
    );
</source>
 
The assembler template is basically GAS-compatible code. When the statement has operand sections (that is, at least one colon follows the template), register names must start with %% instead of %. This means that the following code...

<source lang="c">
asm ("movl %%eax, %%ebx" : : : "ebx");
</source>
 
...will move the contents of eax into ebx. The reason for the %% is an interesting feature of inline assembly: you can use some of your C variables in your assembly code. GCC names these operands %0, %1, and so on, numbered from the first variable mentioned in the output and input operand sections, so a literal register name has to be written with %% to let GCC tell registers and operands apart.
 
How exactly operands work will be explained in more detail in later sections. For now, it is enough to say that if you write something like this...
 
<source lang="c">
int a=10, b;
asm ("movl %1, %%eax\n\t"
     "movl %%eax, %0"
     : "=r" (b)      /* output */
     : "r" (a)       /* input */
     : "%eax"        /* clobbered register */
     );
</source>
 
You've managed to copy the value of "a" into "b" using assembly code, effectively using some C variables in your assembly code. Congratulations!
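
The copy above can be wrapped in a function to make it easy to try out. This is only a minimal sketch: the function name <tt>copy_through_eax</tt> is hypothetical, the <tt>__asm__</tt> spelling is used so that the code also builds in strict ISO modes, and the plain C fallback for non-x86 targets is an assumption added for portability.

<source lang="c">
/* Hypothetical helper: copies a into b by way of EAX. */
static int copy_through_eax(int a)
{
    int b;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %1, %%eax\n\t"   /* load the input operand into EAX */
             "movl %%eax, %0"       /* store EAX into the output operand */
             : "=r" (b)             /* %0: output, write-only */
             : "r" (a)              /* %1: input */
             : "%eax");             /* EAX is modified behind GCC's back */
#else
    b = a;                          /* fallback for non-x86 targets (assumption) */
#endif
    return b;
}
</source>

Because EAX appears in the clobber list, GCC will not pick it for either operand, so the two instructions cannot overwrite their own input.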
 
The Assembler Template defines the assembler instructions to inline. The default is to use AT&T syntax here. If you want to use Intel syntax, <tt>-masm=intel</tt> should be specified as a command-line option.
 
As an example, to halt the CPU, you just have to use the following statement:
 
<source lang="c">
asm( "hlt" );
</source>
 
===Output Operands===
 
In the constraint, 'a' refers to EAX, 'b' to EBX, 'c' to ECX, 'd' to EDX, 'S' to ESI, and 'D' to EDI (read the GCC manual for a full list), assuming that you are coding for the IA32 architecture. An equals sign indicates that the operand is write-only: your assembly code does not care about the initial value of the mapped variable (which allows some optimization). With all that in mind, it's now pretty clear that the following code sets EAX = 0.
 
<source lang="c">
int EAX;
asm( "movl $0, %0"
   : "=a" (EAX)
   );
</source>
 
Notice that the compiler enumerates the operands starting with %0, and that you don't have to add a register to the clobbered register list if it's used to store an output operand. GCC is smart enough to figure out what to do all by itself.
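
As a further illustration of register constraints, the following sketch uses 'a' and 'd' together to collect the 64-bit result that the <tt>mull</tt> instruction leaves in EDX:EAX. The function name <tt>mul32_wide</tt> is hypothetical, and the fallback branch for non-x86 targets is an assumption added so the example builds everywhere.

<source lang="c">
#include <stdint.h>

/* Hypothetical helper: 32x32 -> 64 bit unsigned multiply. */
static uint64_t mul32_wide(uint32_t x, uint32_t y)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    /* mull multiplies EAX by its operand and writes the result to EDX:EAX. */
    __asm__ ("mull %3"
             : "=a" (lo), "=d" (hi)   /* %0 = EAX, %1 = EDX */
             : "a" (x), "rm" (y)      /* %2 = EAX, %3 = register or memory */
             : "cc");                 /* the flags are modified */
    return ((uint64_t)hi << 32) | lo;
#else
    return (uint64_t)x * y;           /* fallback for non-x86 targets (assumption) */
#endif
}
</source>

Neither EAX nor EDX appears in the clobber list, since both are already named as output operands.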
 
Starting with GCC 3.1, you can use more readable labels instead of the error-prone enumeration:
 
<source lang="c">
int current_task;
asm( "str %[output]"
   : [output] "=r" (current_task)
   );
</source>
 
These labels are in a namespace of their own, and will not collide with any C identifiers. The same can be done for input operands, too.
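
A small sketch with a label on both an output and an input operand might look as follows. The function name <tt>negate</tt> and the label names <tt>in</tt> and <tt>out</tt> are chosen for illustration only, and the non-x86 fallback is an assumption.

<source lang="c">
/* Hypothetical helper: returns -value using named operands. */
static int negate(int value)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("movl %[in], %[out]\n\t"   /* copy the input into the output register */
             "negl %[out]"              /* two's complement negation in place */
             : [out] "=r" (result)
             : [in] "r" (value)
             : "cc");                   /* negl modifies the flags */
#else
    result = -value;                    /* fallback for non-x86 targets (assumption) */
#endif
    return result;
}
</source>

Reordering or adding operands later does not require renumbering anything in the template, which is the main advantage over %0 and %1.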
 
While the Output Operands are generally used for... well... output, the Input Operands let you parametrize the ASM code, i.e., pass read-only parameters from C code to the ASM block. Again, string literals are used to specify the details.
 
If you want to move some value to EAX, you can do it the following way (even though it would certainly be pretty useless to do so instead of directly mapping the value to EAX):
 
<source lang="c">
int randomness = 4;
asm( "movl %0, %%eax"
   :
   : "r" (randomness)
   : "%eax"
   );
</source>
 
Note that GCC will always assume that input operands are read-only (unchanged). The correct thing to do when an operand is both read and written is to list it as an output, using the <tt>+</tt> modifier instead of the equals sign because this time its original value matters. Here is a simple example of a plain input operand:
<source lang="c">
asm("mov %%eax,%%ebx" : : "a" (amount) : "ebx"); // useless, but it conveys the idea
</source>
EAX will contain "amount", which is then moved into EBX.
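
For an operand that is both read and written, a minimal sketch could look like this. It assumes GCC's <tt>+</tt> constraint modifier, which marks an output operand as read-write; the function name <tt>increment</tt> is hypothetical and the non-x86 fallback is an assumption.

<source lang="c">
/* Hypothetical helper: adds one to counter in place. */
static int increment(int counter)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("incl %0"
             : "+r" (counter)   /* read-write: the old value matters, a new one is produced */
             :                  /* no separate inputs */
             : "cc");           /* incl modifies the flags */
#else
    counter += 1;               /* fallback for non-x86 targets (assumption) */
#endif
    return counter;
}
</source>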
 
{| {{wikitable}}
|-
| <source lang="c">"movl $0, %0" : "=g" (x)</source>
| x can be whatever the compiler prefers: a register or a memory reference. In another context it could even be a literal constant.
|-
| <source lang="c">"movl %%es, %0" : "=r" (x)</source>
| x must go through a general-purpose register. If x does not already live in a register, the compiler will move it to the place it should be afterwards. This means that <code>"movl %0, %%es" : : "r" (0x38)</code> is enough to load a segment register.
|-
| <source lang="c">"outl %0, %1" : : "a" (0xFE), "N" (0x21)</source>
| tells the compiler that the value 0x21 can be used as an immediate constant in the <tt>in</tt> or <tt>out</tt> instruction, provided it lies in the range 0 to 255.
|}
When compiling with <tt>gcc -std=c99</tt> (or another strict ISO mode), the <tt>asm</tt> keyword is not recognized. Use <tt>__asm__</tt> instead.
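
The following sketch is written with the <tt>__asm__</tt> spelling and should therefore build both with and without <tt>-std=c99</tt>. The function name <tt>swap_bytes</tt> is hypothetical, and the shift-based branch for non-x86 targets is an assumption.

<source lang="c">
#include <stdint.h>

/* Hypothetical helper: reverses the byte order of a 32-bit value. */
static uint32_t swap_bytes(uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("bswap %0" : "+r" (v));   /* bswap needs a 486 or later */
#else
    /* fallback for non-x86 targets (assumption) */
    v = (v >> 24) | ((v >> 8) & 0x0000FF00u)
      | ((v << 8) & 0x00FF0000u) | (v << 24);
#endif
    return v;
}
</source>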
 
==Assigning Labels==
It is possible to assign so-called ASM labels to C/C++ identifiers. You can do this by using the <tt>asm</tt> keyword on variable definitions, as seen in this example:

<source lang="c">
int some_obscure_name asm("param") = 5; // "param" will be accessible in inline Assembly.

void foo()
{
    asm("mov param, %%eax" : : : "eax");
}
</source>
 
Here's an example of how you can access these variables if you don't explicitly state a name:

<source lang="c">
int some_obscure_name = 5;

void foo()
{
    asm("mov some_obscure_name, %%eax" : : : "eax");
}
</source>

Note that you might also be obliged to use '''_some_obscure_name''' (with a leading underscore), depending on your target and linkage options.
 
==Intel Syntax==
You can let GCC use Intel syntax by enabling it in inline assembly, like so:
 
<source lang="c">
asm(".intel_syntax noprefix");
asm("mov eax, ebx");
</source>
 
Similarly, you can switch back to AT&T syntax by using the following snippet:
 
<source lang="c">
asm(".att_syntax prefix");
asm("mov %ebx, %eax");
</source>
 
This way you can combine Intel syntax and AT&T syntax inline assembly. Note that once you switch to one of these syntax types, everything below the directive in the source file will be assembled using this syntax, including the AT&T code that the compiler itself generates, so don't forget to switch back when necessary, or you might get lots of assembler errors!
 
There is also a command-line option <tt>-masm=intel</tt> to globally trigger Intel syntax.
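
One way to keep the switch contained is to change syntax and change back inside a single statement. This is a sketch under the assumption that the file is compiled in the default AT&T mode (that is, without <tt>-masm=intel</tt>); the function name <tt>load_constant_intel</tt> is hypothetical, as is the fallback for non-x86 targets.

<source lang="c">
/* Hypothetical helper: loads 42 into EAX using Intel syntax. */
static int load_constant_intel(void)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (".intel_syntax noprefix\n\t"   /* switch the assembler to Intel syntax */
             "mov eax, 42\n\t"              /* destination first, no % or $ prefixes */
             ".att_syntax prefix"           /* restore AT&T syntax for compiler output */
             : "=a" (result));              /* the result is taken from EAX */
#else
    result = 42;                            /* fallback for non-x86 targets (assumption) */
#endif
    return result;
}
</source>

The register is named directly in the template here because, in AT&T mode, GCC would substitute operands such as %0 in AT&T notation, which the assembler rejects while Intel syntax is active.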
 
==See Also==
===Articles===
* [[Inline Assembly/Examples]] - useful and commonly used functions
 
===External===
* [http://gcc.gnu.org/onlinedocs/ GCC Manuals]
* [http://www-106.ibm.com/developerworks/library/l-ia.html Inline assembly for x86 in Linux (by IBM)]
 
[[Category:Assembly]]