X86-64 Instruction Encoding: Difference between revisions

 
== Opcode ==
The x86-64 instruction set defines many opcodes and many ways to encode them, depending on several factors.
Opcodes are 1, 2 or 3 bytes long. Some opcodes begin with a byte that is also used as a prefix; in that position the byte is part of the opcode and does not take the meaning of the prefix.
 
=== Legacy opcodes ===
Legacy (and x87) opcodes consist of, in this order:
* mandatory prefix;
* REX prefix;
* opcode.
 
==== Mandatory prefix ====
Certain instructions (most notably the SIMD instructions) require a mandatory prefix (0x66, 0xF2 or 0xF3), which looks like a normal modifier prefix. When a mandatory prefix is required, it is put with the modifier prefixes before the REX prefix (if any).
 
==== REX prefix ====
The REX prefix is only available in [[Long mode|long mode]].
 
===== Usage =====
A REX prefix must be encoded when:
* using 64-bit operand size and the instruction does not default to 64-bit operand size; or
* using one of the extended registers (R8 to R15, XMM8 to XMM15, YMM8 to YMM15, CR8 to CR15 and DR8 to DR15); or
* using one of the uniform byte registers SPL, BPL, SIL or DIL.
 
A REX prefix must not be encoded when:
* using one of the high byte registers AH, CH, BH or DH.
 
In all other cases, the REX prefix is ignored. The use of multiple REX prefixes is undefined, although processors seem to use only the last REX prefix.
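
The rules above can be sketched as a small check. This is only an illustration: the function name <code>rex_required</code>, its parameters and the register-name sets are hypothetical, and the sub-register names (such as R8D) are added as an assumption for convenience.

```python
# Register names that can only be reached with a REX prefix.
# Sub-register names such as R8D are included as an assumption.
EXTENDED = {f"{p}{n}" for p in ("R", "XMM", "YMM", "CR", "DR") for n in range(8, 16)}
EXTENDED |= {f"R{n}{s}" for n in range(8, 16) for s in ("D", "W", "B")}
UNIFORM_BYTE = {"SPL", "BPL", "SIL", "DIL"}
HIGH_BYTE = {"AH", "CH", "BH", "DH"}


def rex_required(registers, want_64bit=False, defaults_to_64bit=False):
    """Return True if a REX prefix must be encoded, False if it is not needed.

    Raises ValueError when one operand requires REX and another forbids it.
    """
    regs = {name.upper() for name in registers}
    required = (want_64bit and not defaults_to_64bit) or bool(
        regs & (EXTENDED | UNIFORM_BYTE)
    )
    forbidden = bool(regs & HIGH_BYTE)
    if required and forbidden:
        raise ValueError("high byte registers cannot be combined with a REX prefix")
    return required
```

For example, <code>rex_required(["AH", "R8B"])</code> raises an error, because AH forbids the prefix that R8B requires.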
 
Instructions that default to 64-bit operand size in long mode are:
{| {{wikitable}}
|-
| CALL (near)||ENTER||Jcc
|-
| JrCXZ||JMP (near)||LEAVE
|-
| LGDT||LIDT||LLDT
|-
| LOOP||LOOPcc||LTR
|-
| MOV CR(n)||MOV DR(n)||POP reg/mem
|-
| POP reg||POP FS||POP GS
|-
| POPFQ||PUSH imm8||PUSH imm32
|-
| PUSH reg/mem||PUSH reg||PUSH FS
|-
| PUSH GS||PUSHFQ||RET (near)
|}
 
===== Encoding =====
The layout is as follows:
<pre>
  7                           0
+---+---+---+---+---+---+---+---+
| 0   1   0   0 | W | R | X | B |
+---+---+---+---+---+---+---+---+
</pre>
{| {{wikitable}}
! Field
! Length
! Description
|-
| b0100||4 bits||Fixed bit pattern
|-
| W||1 bit||When 1, a 64-bit operand size is used. Otherwise, when 0, the default operand size is used (which is 32-bit for most but not all instructions, see [[#Operand-size and address-size override prefix|this table]]).
|-
| R||1 bit||This 1-bit value is an extension to the ''MODRM.reg'' field. See [[#Registers|Registers]].
|-
| X||1 bit||This 1-bit value is an extension to the ''SIB.index'' field. See [[#64-bit addressing|64-bit addressing]].
|-
| B||1 bit||This 1-bit value is an extension to the ''MODRM.rm'' field or the ''SIB.base'' field. See [[#64-bit addressing|64-bit addressing]].
|}
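
As a minimal sketch of this layout, the following packs and unpacks the four flag bits. The helper names <code>encode_rex</code> and <code>decode_rex</code> are hypothetical, chosen for illustration only.

```python
def encode_rex(w=0, r=0, x=0, b=0):
    """Pack the W, R, X and B bits under the fixed 0100 high nibble."""
    for bit in (w, r, x, b):
        if bit not in (0, 1):
            raise ValueError("REX fields are single bits")
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def decode_rex(byte):
    """Return the W, R, X, B bits, or None if the byte is not a REX prefix."""
    if byte & 0xF0 != 0x40:
        return None
    return {
        "W": (byte >> 3) & 1,
        "R": (byte >> 2) & 1,
        "X": (byte >> 1) & 1,
        "B": byte & 1,
    }
```

For instance, <code>encode_rex(w=1)</code> gives 0x48, the prefix seen in <code>48 89 D8</code> (MOV RAX, RBX).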
 
==== Opcode ====
The opcode can be 1, 2 or 3 bytes in length. Depending on the opcode escape sequence, a different opcode map is selected. Possible opcode sequences are:
* <op>
* 0x0F <op>
* 0x0F 0x38 <op>
* 0x0F 0x3A <op>
 
Note that some opcodes fix the ''MODRM.reg'' field at a particular value, so that the field acts as an extension of the opcode rather than as a register operand.
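
The escape sequences listed above could be recognised as follows. This is a sketch under two assumptions: all prefixes have already been stripped from the input, and the 3DNow! sequence 0x0F 0x0F is not treated specially. The function name <code>split_legacy_opcode</code> is hypothetical.

```python
def split_legacy_opcode(code):
    """Split a legacy opcode into (escape bytes, opcode byte).

    `code` holds the instruction bytes with any prefixes already removed.
    """
    code = bytes(code)
    if not code:
        raise ValueError("empty input")
    if code[0] != 0x0F:
        escape = b""            # one-byte opcode map
    elif len(code) > 1 and code[1] in (0x38, 0x3A):
        escape = code[:2]       # 0x0F 0x38 or 0x0F 0x3A map
    else:
        escape = code[:1]       # 0x0F map
    if len(code) <= len(escape):
        raise ValueError("opcode byte missing after escape sequence")
    return escape, code[len(escape)]
```

With this, <code>0F A2</code> (CPUID) selects the 0x0F map with opcode 0xA2, while <code>0F 38 00</code> (PSHUFB) selects the 0x0F 0x38 map with opcode 0x00.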
 
=== VEX/XOP opcodes ===
A VEX/XOP prefix must be encoded when:
* the instruction has only its VEX/XOP opcode and no legacy opcode; or
* 256-bit YMM registers are used; or
* more than two register or memory operands are encoded (e.g. 'nondestructive-source operations', which take a destination and two separate sources); or
* a 128-bit XMM destination register is used and bits 128-255 of the corresponding YMM register must be cleared.
 
A VEX/XOP prefix must not be encoded when:
* a 128-bit XMM destination register is used and bits 128-255 of the corresponding YMM register must be left unchanged.
 
 
There are many VEX and XOP instructions, all of which can be encoded using the three byte VEX/XOP escape prefix. The VEX and XOP escape prefixes use fields with the following semantics:
 
{| {{wikitable}}
! Field
! Length
! Description
|-
| VEX/XOP prefix||8 bits||Prefix.
{| {{wikitable}}
! Prefix
! Opcode map and encoding
|-
| 0xC4||Three-byte VEX
|-
| 0xC5||Two-byte VEX
|-
| 0x8F||Three-byte XOP
|}
|-
| ~R||1 bit||This 1-bit value is an 'inverted' extension to the ''MODRM.reg'' field. The inverse of REX.R. See [[#Registers|Registers]].
|-
| ~X||1 bit||This 1-bit value is an 'inverted' extension to the ''SIB.index'' field. The inverse of REX.X. See [[#64-bit addressing|64-bit addressing]].
|-
| ~B||1 bit||This 1-bit value is an 'inverted' extension to the ''MODRM.rm'' field or the ''SIB.base'' field. The inverse of REX.B. See [[#64-bit addressing|64-bit addressing]].
|-
| map_select||5 bits||Specifies the opcode map to use.
|-
| W/E||1 bit||For integer instructions: when 1, a 64-bit operand size is used; otherwise, when 0, the default operand size is used (equivalent to REX.W). For non-integer instructions, this bit is a general opcode extension bit.
|-
| ~vvvv||4 bits||An additional operand for the instruction. The value of the XMM or YMM register (see [[#Registers|Registers]]) is 'inverted'.
|-
| L||1 bit||When 0, a 128-bit vector length is used. Otherwise, when 1, a 256-bit vector length is used.
|-
| pp||2 bits||Specifies an implied mandatory prefix for the opcode.
{| {{wikitable}}
! Value
! Implied mandatory prefix
|-
| b00||none
|-
| b01||0x66
|-
| b10||0xF3
|-
| b11||0xF2
|}
|}
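
To illustrate how these fields stand in for legacy bytes, the sketch below maps ''pp'' and ''map_select'' back to the mandatory prefix and escape sequence they imply. The ''pp'' values come from the table above. The ''map_select'' values (1, 2 and 3 for the 0x0F, 0x0F 0x38 and 0x0F 0x3A maps) are not listed in this article; they are an assumption taken from the vendor manuals, apply to VEX only, and the names used here are hypothetical.

```python
# Implied mandatory prefix for each pp value (from the table above).
PP_PREFIX = {0b00: b"", 0b01: b"\x66", 0b10: b"\xf3", 0b11: b"\xf2"}

# Escape sequence for each VEX map_select value (assumption, see lead-in).
MAP_ESCAPE = {0b00001: b"\x0f", 0b00010: b"\x0f\x38", 0b00011: b"\x0f\x3a"}


def implied_legacy_bytes(map_select, pp):
    """Return the mandatory prefix and escape bytes implied by the VEX fields."""
    if pp not in PP_PREFIX:
        raise ValueError("pp is a 2-bit field")
    if map_select not in MAP_ESCAPE:
        raise ValueError("map_select value not covered by this sketch")
    return PP_PREFIX[pp] + MAP_ESCAPE[map_select]
```

For example, <code>implied_legacy_bytes(0b00001, 0b01)</code> yields <code>66 0F</code>, the byte sequence that the two fields replace.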
 
==== Three byte VEX escape prefix ====
The layout is as follows, starting with a byte with value 0xC4:
<pre>
  7                           0     7                           0     7                           0
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| 1   1   0   0   0   1   0   0 | |~R |~X |~B |     map_select    | |W/E|     ~vvvv     | L |   pp  |
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
</pre>
A VEX instruction with VEX.~X == 1, VEX.~B == 1, VEX.W/E == 0 and map_select == b00001 may be encoded using the [[#Two byte VEX escape prefix|two byte VEX escape prefix]].
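
A minimal sketch of packing these fields is shown below. The function name <code>encode_vex3</code> is hypothetical. It takes the plain (non-inverted) values of R, X, B and the register number for vvvv, and applies the inversion itself.

```python
def encode_vex3(r, x, b, map_select, w, vvvv, l, pp):
    """Build the three-byte VEX prefix from plain (non-inverted) field values."""
    for bit in (r, x, b, w, l):
        if bit not in (0, 1):
            raise ValueError("R, X, B, W/E and L are single bits")
    if not 0 <= map_select < 32 or not 0 <= vvvv < 16 or not 0 <= pp < 4:
        raise ValueError("field value out of range")
    # ~R, ~X and ~B are stored inverted in the top three bits of byte 1.
    byte1 = ((r ^ 1) << 7) | ((x ^ 1) << 6) | ((b ^ 1) << 5) | map_select
    # ~vvvv is stored as the one's complement of the register number.
    byte2 = (w << 7) | ((vvvv ^ 0xF) << 3) | (l << 2) | pp
    return bytes([0xC4, byte1, byte2])
```

For example, <code>encode_vex3(0, 0, 0, 0b00001, 0, 1, 1, 0b00)</code> gives <code>C4 E1 74</code>. Followed by opcode 0x58 and ModR/M 0xC2, this would form VADDPS YMM0, YMM1, YMM2.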
 
==== Three byte XOP escape prefix ====
The layout is the same as the [[#Three byte VEX escape prefix|three byte VEX escape prefix]], but with initial byte value 0x8F:
<pre>
  7                           0     7                           0     7                           0
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| 1   0   0   0   1   1   1   1 | |~R |~X |~B |     map_select    | |W/E|     ~vvvv     | L |   pp  |
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
</pre>
 
==== Two byte VEX escape prefix ====
A VEX instruction with VEX.~X == 1, VEX.~B == 1, VEX.W/E == 0 and map_select == b00001 may be encoded using the two byte VEX escape prefix. The layout is as follows:
<pre>
  7                           0     7                           0
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| 1   1   0   0   0   1   0   1 | |~R |     ~vvvv     | L |   pp  |
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
</pre>
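
The condition above can be sketched as a conversion from the three-byte form (0xC4 followed by two bytes carrying ~R, ~X, ~B, map_select, W/E, ~vvvv, L and pp) to the two-byte form. The function name <code>shorten_vex</code> is hypothetical, and the input is assumed to be a well-formed three-byte VEX prefix.

```python
def shorten_vex(prefix):
    """Return the two-byte VEX form of a three-byte prefix when it is eligible.

    If the fields do not meet the condition, the prefix is returned unchanged.
    """
    prefix = bytes(prefix)
    if len(prefix) != 3 or prefix[0] != 0xC4:
        raise ValueError("expected a three-byte VEX prefix starting with 0xC4")
    byte1, byte2 = prefix[1], prefix[2]
    not_x = (byte1 >> 6) & 1
    not_b = (byte1 >> 5) & 1
    map_select = byte1 & 0x1F
    w = byte2 >> 7
    if not (not_x == 1 and not_b == 1 and w == 0 and map_select == 0b00001):
        return prefix
    # Keep ~R from byte 1 and ~vvvv, L and pp from byte 2.
    return bytes([0xC5, (byte1 & 0x80) | (byte2 & 0x7F)])
```

For example, <code>C4 E1 74</code> shortens to <code>C5 F4</code>, whereas <code>C4 E2 79</code> (map_select == b00010) has no two-byte form and is returned as is.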
 
=== 3DNow! opcodes ===
3DNow! opcodes consist of, in this order:
* fixed opcode;
* (ModR/M, SIB, displacement);
* immediate opcode byte.
 
==== Fixed opcode ====
All 3DNow! opcodes have a fixed two-byte sequence equal to 0x0F 0x0F in the opcode position of the instruction.
 
==== Immediate opcode byte ====
3DNow! instructions encode the actual opcode as an 8-bit immediate value trailing the instruction (thus after the ModR/M, SIB and displacement).
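
The byte order described above can be sketched as follows. The function name <code>encode_3dnow</code> and its parameters are hypothetical. The caller is assumed to supply an already encoded ModR/M byte, an optional SIB byte and any displacement bytes.

```python
def encode_3dnow(suffix_opcode, modrm, sib=None, displacement=b""):
    """Assemble a 3DNow! instruction: 0F 0F, ModR/M, SIB, displacement, opcode."""
    out = bytearray([0x0F, 0x0F, modrm])  # fixed opcode, then ModR/M
    if sib is not None:
        out.append(sib)
    out += bytes(displacement)
    out.append(suffix_opcode)             # the actual opcode comes last
    return bytes(out)
```

For example, with the PFADD opcode byte 0x9E and a register-to-register ModR/M byte of 0xC1 (MM0, MM1), the result is <code>0F 0F C1 9E</code>.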
 
== ModR/M and SIB bytes ==