VMX

 
=== Discovering support ===
After enumerating the features from [[CPUID]] (leaf 1, EAX=1), bit 5 of ECX tells you whether VMX is supported. This is the first thing that should be checked!
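
A minimal sketch of this check, assuming GCC or Clang and their <code>cpuid.h</code> helper <code>__get_cpuid</code>. The names <code>vmx_bit_set</code> and <code>cpu_has_vmx</code> are illustrative and not part of any existing API:

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (assumption: one of these compilers) */
#endif

#define CPUID_1_ECX_VMX (1u << 5)   /* CPUID leaf 1, ECX bit 5 = VMX */

/* Pure helper: test the VMX bit in an ECX value returned by CPUID leaf 1. */
static inline int vmx_bit_set(uint32_t ecx)
{
    return (ecx & CPUID_1_ECX_VMX) != 0;
}

#if defined(__i386__) || defined(__x86_64__)
/* Returns 1 if the CPU reports VMX, 0 otherwise (or if leaf 1 is unavailable). */
static inline int cpu_has_vmx(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return vmx_bit_set(ecx);
}
#endif
```

A kernel without access to <code>cpuid.h</code> would issue the CPUID instruction through its own inline assembly and pass the resulting ECX to the same bit test.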
 
=== Basic environment ===
Software should enable bit 13 of CR4 (CR4.VMXE) first. Here are definitions of the MSRs that we'll be dealing with:
<syntaxhighlight lang="c">
#define IA32_VMX_BASIC 0x480
#define IA32_VMX_CR0_FIXED0 0x486
#define IA32_VMX_CR0_FIXED1 0x487
#define IA32_VMX_CR4_FIXED0 0x488
#define IA32_VMX_CR4_FIXED1 0x489
</syntaxhighlight>
 
To see what your OS needs to support, read the lower 32 bits of IA32_VMX_CR0_FIXED0 and IA32_VMX_CR4_FIXED0. The bits set in those MSRs must be set (and supported) in the corresponding control registers (CR0/CR4). If any of those bits is not set, a general protection fault (#GP) will occur. To see which extra bits may be set, check IA32_VMX_CR0_FIXED1 and IA32_VMX_CR4_FIXED1. Each is a mask of bits: if a bit is 1, the corresponding control-register bit ''can'' be enabled, but if it is 0, enabling that bit may generate an exception.
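
The FIXED0/FIXED1 rule can be sketched as two small helpers. This is only an illustration: the function names are made up for this example, and reading the MSRs and writing CR0/CR4 is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Apply the FIXED0/FIXED1 constraints to a control-register value.
 * fixed0: bits that must be 1; fixed1: bits that are allowed to be 1.
 */
static inline uint32_t vmx_adjust_cr(uint32_t cr, uint32_t fixed0, uint32_t fixed1)
{
    /* A bit that is required must also be allowed. */
    assert((fixed0 & ~fixed1) == 0);
    return (cr | fixed0) & fixed1;
}

/* Returns 1 if cr already satisfies both constraints, 0 otherwise. */
static inline int vmx_cr_is_valid(uint32_t cr, uint32_t fixed0, uint32_t fixed1)
{
    return (cr & fixed0) == fixed0 && (cr & ~fixed1) == 0;
}
```

For example, with a hypothetical IA32_VMX_CR0_FIXED0 of 0x80000021 (PG, NE and PE required) and every bit allowed by FIXED1, a CR0 of 0x00000011 would be adjusted to 0x80000031.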
 
From my own hardware and emulator (Bochs), I've found that you will most likely need to enable these bits: CR0.NE, CR4.VMXE, CR0.PG and CR0.PE. For 64-bit kernels, you need to have IA32_EFER.LMA set.
 
=== Executing VMXON ===
The main entry point for using VMX is the VMXON instruction. It takes a single m64 operand: a 64-bit value in memory that holds the address of the VMXON region. The region must be 4096-byte aligned (bits 0-11 of the address must be 0), and the only field software should write in it is the VMCS revision identifier at the start of the region. This field must contain the value in bits 0-30 of MSR IA32_VMX_BASIC (bit 31 of that MSR is always 0). To prepare a memory address in 32-bit protected mode for use as an m64, some modifications need to be made: on processors that are not long mode capable, the upper 32 bits of the m64 have to be 0, or the instruction fails with an invalid-address error.
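<syntaxhighlight lang="c">
/* allocate_4k_aligned() stands for your own allocator; VMXON needs the physical address. */
uint32_t *region = (uint32_t *)allocate_4k_aligned(4096);
/* Zero-extend the 32-bit address into a 64-bit memory operand. */
uint64_t region64 = (uint64_t)((size_t)(region) & 0xFFFFFFFF);
asm volatile("vmxon %0" :: "m"(region64) : "cc", "memory");
</syntaxhighlight>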
 
This general process of taking a 32-bit memory address and turning it into a pseudo-64-bit integer (unsigned long long) will be used for all m64 operands later. VMCLEAR is another example of an instruction that requires the upper 32 bits of the memory address to be 0.

Long mode capable processors simply require a 64-bit pointer to the region.

Note: '''The VMXON, VMCLEAR and VMPTRLD instructions must be given the physical address of their respective regions.'''
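
Preparing a region before handing it to VMXON or VMPTRLD could look like the sketch below. It assumes the caller has already read IA32_VMX_BASIC and passes in a virtual mapping of the page. Clearing the whole page first is a common precaution rather than something stated above, and <code>vmx_init_region</code> is a name invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VMX_REGION_SIZE 4096u

/*
 * Initialise a VMXON or VMCS region: clear it and store the VMCS revision
 * identifier (bits 0-30 of IA32_VMX_BASIC) in the first 32-bit word.
 * `region` is the virtual mapping of the page; the instruction itself
 * must later be given the matching physical address.
 */
static inline void vmx_init_region(void *region, uint64_t vmx_basic)
{
    uint32_t revision = (uint32_t)(vmx_basic & 0x7FFFFFFFu);

    /* The region must be 4096-byte aligned. */
    assert(((uintptr_t)region & (VMX_REGION_SIZE - 1)) == 0);

    memset(region, 0, VMX_REGION_SIZE);
    memcpy(region, &revision, sizeof revision);
}
```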
 
If the zero flag is set, it indicates that the VMCS pointer is valid but there is some other error specified in the VM-instruction error field (encoding 4400h). Error numbers are listed in section 5.4 of the Intel SDM Volume 2B.
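
As a sketch, the outcome of a VMX instruction can be classified from the flags it leaves behind. This assumes the flags were captured right after the instruction (for example with <code>pushf</code>) and relies on the SDM convention that the carry flag signals a failure without a valid current VMCS. The enum and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_CF (1u << 0)   /* carry flag */
#define RFLAGS_ZF (1u << 6)   /* zero flag  */

enum vmx_result {
    VMX_SUCCEED,        /* CF = 0 and ZF = 0 */
    VMX_FAIL_INVALID,   /* CF = 1: no valid current VMCS, no error number available */
    VMX_FAIL_VALID      /* ZF = 1: error number is in the VM-instruction error field */
};

/* Classify the result of a VMX instruction from the saved flags register. */
static inline enum vmx_result vmx_decode_flags(uint64_t rflags)
{
    if (rflags & RFLAGS_CF)
        return VMX_FAIL_INVALID;
    if (rflags & RFLAGS_ZF)
        return VMX_FAIL_VALID;
    return VMX_SUCCEED;
}
```

Only in the <code>VMX_FAIL_VALID</code> case is it meaningful to VMREAD field 4400h for the error number.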
 
The following table represents the error numbers in the VM-instruction error field:
 
{| {{wikitable}}
|-
! Error Number !! Description
|-
| 0x01 || VMCALL executed in VMX root operation.
|-
| 0x02 || VMCLEAR with invalid physical address.
|-
| 0x03 || VMCLEAR with VMXON pointer.
|-
| 0x04 || VMLAUNCH with non-clear VMCS.
|-
| 0x05 || VMRESUME with non-launched VMCS.
|-
| 0x06 || VMRESUME with a corrupted VMCS (indicates corruption of the current VMCS).
|-
| 0x07 || VM entry with invalid VMX-control field(s).
|-
| 0x08 || VM entry with invalid host-state field(s).
|-
| 0x09 || VMPTRLD with invalid physical address.
|-
| 0x0A || VMPTRLD with VMXON pointer.
|-
| 0x0B || VMPTRLD with incorrect VMCS revision identifier.
|-
| 0x0C || VMREAD/VMWRITE from/to unsupported VMCS component.
|-
| 0x0D || VMWRITE to read-only VMCS component.
|-
| 0x0F || VMXON executed in VMX root operation.
|-
| 0x1A || VM entry with events blocked by MOV SS.
|}
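
For debugging output, the table can be turned into a lookup function. This is just a convenience sketch; <code>vmx_error_string</code> is a hypothetical name, and numbers not listed in the table fall through to a generic message.

```c
#include <assert.h>

/* Map a VM-instruction error number to the description from the table above. */
static inline const char *vmx_error_string(unsigned int err)
{
    switch (err) {
    case 0x01: return "VMCALL executed in VMX root operation";
    case 0x02: return "VMCLEAR with invalid physical address";
    case 0x03: return "VMCLEAR with VMXON pointer";
    case 0x04: return "VMLAUNCH with non-clear VMCS";
    case 0x05: return "VMRESUME with non-launched VMCS";
    case 0x06: return "VMRESUME with a corrupted VMCS";
    case 0x07: return "VM entry with invalid VMX-control field(s)";
    case 0x08: return "VM entry with invalid host-state field(s)";
    case 0x09: return "VMPTRLD with invalid physical address";
    case 0x0A: return "VMPTRLD with VMXON pointer";
    case 0x0B: return "VMPTRLD with incorrect VMCS revision identifier";
    case 0x0C: return "VMREAD/VMWRITE from/to unsupported VMCS component";
    case 0x0D: return "VMWRITE to read-only VMCS component";
    case 0x0F: return "VMXON executed in VMX root operation";
    case 0x1A: return "VM entry with events blocked by MOV SS";
    default:   return "unknown or unlisted VM-instruction error";
    }
}
```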
 
== VMCS ==
A VMCS is loaded with the VMPTRLD instruction, which loads and activates a VMCS and requires a 64-bit memory address as its operand, in the same format as VMXON/VMCLEAR.
 
<syntaxhighlight lang="c">
/* vmcsRegion64 holds the physical address of the VMCS region. */
asm volatile("vmptrld %0" :: "m"(vmcsRegion64) : "cc", "memory");
</syntaxhighlight>
 
The structure of the VMCS is covered in detail in Chapter 20 of the Intel SDM volume 3B (see link below). Field encodings for VMWRITE and VMREAD are covered in Appendix H of the same manual.
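
As a rough illustration of how those field encodings are laid out, the sketch below splits an encoding into its parts, following the bit layout given in the SDM's field-encoding appendix. The struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

struct vmcs_field_info {
    unsigned access_high; /* bit 0: 1 = high 32 bits of a 64-bit field */
    unsigned index;       /* bits 9:1 */
    unsigned type;        /* bits 11:10: 0 control, 1 read-only exit information,
                             2 guest state, 3 host state */
    unsigned width;       /* bits 14:13: 0 16-bit, 1 64-bit, 2 32-bit, 3 natural width */
};

/* Split a VMREAD/VMWRITE field encoding into its components. */
static inline struct vmcs_field_info vmcs_decode_field(uint32_t encoding)
{
    struct vmcs_field_info f;

    f.access_high = encoding & 1u;
    f.index       = (encoding >> 1) & 0x1FFu;
    f.type        = (encoding >> 10) & 0x3u;
    f.width       = (encoding >> 13) & 0x3u;
    return f;
}
```

Decoding 4400h, the VM-instruction error field, gives a 32-bit read-only field with index 0, which is why a VMWRITE to it fails.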
== Peripheral Emulation ==
=== IO framework emulation ===
On x86 there are two kinds of I/O channels: port-based I/O ('''PIO''') and memory-mapped I/O ('''MMIO'''). PIO has a separate address space and dedicated instructions for I/O. With MMIO, the device I/O space is mapped into the memory address space, so ordinary memory move instructions perform the I/O.
==== PIO emulation ====
With Intel VT-x, the hypervisor decides whether the guest's I/O instructions trap into VMX root mode. Setting bit 24 of the primary processor-based VM-execution controls (unconditional I/O exiting) makes every guest I/O instruction cause a VM exit. Otherwise, set bit 25 (use I/O bitmaps) and set up the two I/O bitmap regions to capture only the ports you are interested in.
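
A sketch of marking ports in the two bitmaps follows. It relies on the layout described in the Intel SDM: each bitmap is 4 KiB with one bit per port, bitmap A covers ports 0x0000-0x7FFF, bitmap B covers ports 0x8000-0xFFFF, and a set bit causes a VM exit. The function names are illustrative, and writing the bitmap addresses into the VMCS is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define IO_BITMAP_SIZE 4096u   /* each of the two bitmaps is one 4 KiB page */

/* Mark `port` so that guest accesses to it cause a VM exit. */
static inline void io_bitmap_trap_port(uint8_t *bitmap_a, uint8_t *bitmap_b, uint16_t port)
{
    uint8_t *bitmap = (port < 0x8000u) ? bitmap_a : bitmap_b;
    unsigned bit = port & 0x7FFFu;

    bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Returns 1 if accesses to `port` are currently trapped, 0 otherwise. */
static inline int io_bitmap_port_trapped(const uint8_t *bitmap_a, const uint8_t *bitmap_b,
                                         uint16_t port)
{
    const uint8_t *bitmap = (port < 0x8000u) ? bitmap_a : bitmap_b;
    unsigned bit = port & 0x7FFFu;

    return (bitmap[bit / 8] >> (bit % 8)) & 1u;
}
```

Both bitmaps should start out zeroed so that only the ports marked explicitly cause exits.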
 
An I/O instruction causes a VM exit with basic exit reason 30. From the exit qualification you can retrieve the operation size, direction, port number and so on. For more, refer to vmx_pio.c in the References section.
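
Decoding the exit qualification of an I/O exit might look like the following sketch. The bit positions follow the exit-qualification layout for I/O instructions in the Intel SDM; the struct and function names are made up for this example and are not taken from vmx_pio.c.

```c
#include <assert.h>
#include <stdint.h>

struct pio_exit {
    unsigned size;     /* access size in bytes: 1, 2 or 4 */
    int is_in;         /* 1 = IN (read from port), 0 = OUT */
    int is_string;     /* 1 = string instruction (INS/OUTS) */
    int has_rep;       /* 1 = REP prefixed */
    int port_in_imm;   /* 1 = port given as immediate, 0 = port taken from DX */
    uint16_t port;     /* port number */
};

/* Decode the exit qualification of a VM exit with basic exit reason 30. */
static inline struct pio_exit pio_decode_qualification(uint64_t qual)
{
    struct pio_exit e;

    e.size        = (unsigned)(qual & 0x7u) + 1;   /* bits 2:0 hold size minus one */
    e.is_in       = (int)((qual >> 3) & 1u);
    e.is_string   = (int)((qual >> 4) & 1u);
    e.has_rep     = (int)((qual >> 5) & 1u);
    e.port_in_imm = (int)((qual >> 6) & 1u);
    e.port        = (uint16_t)((qual >> 16) & 0xFFFFu);
    return e;
}
```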
==== MMIO emulation ====
MMIO emulation on x86 works a bit differently: EPT is used to capture MMIO accesses.
VMX provides two kinds of EPT-related VM exits: EPT violation and EPT misconfiguration. In general, when the guest accesses memory whose EPT entry is not configured correctly (e.g. the memory is marked '''writable''' but '''not readable'''), the access causes a VM exit with the EPT misconfiguration reason. A hypervisor can configure the EPT entries of MMIO ranges this way on purpose so that guest accesses to them trap.
To emulate an MMIO access, the hypervisor must do the following:
# Decode the memory move instruction to determine its length, access size, direction, operation, register indices or immediates, and the memory address involved.
# Search the MMIO device regions to see whether the address is backed by a device, and if so let that device handle the access.
# Store the result in the destination register if necessary.
# Advance to the next instruction by adding the instruction length resolved in step 1 to the guest RIP.
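
Steps 2 to 4 above can be sketched as follows, taking the output of the instruction decoder from step 1 as given. All types and names here are hypothetical and are not taken from the referenced ZeldaOS code. The one-register "scratch" device exists only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mmio_region {
    uint64_t base;   /* guest-physical start address */
    uint64_t size;   /* length in bytes */
    /* Device callback: performs the access and returns the value read. */
    uint64_t (*access)(uint64_t offset, unsigned size, int is_write, uint64_t value);
};

/* Result of step 1 (the decoder itself is not shown). */
struct mmio_access {
    uint64_t gpa;        /* guest-physical address being accessed */
    unsigned size;       /* access size in bytes */
    unsigned insn_len;   /* length of the decoded instruction */
    int is_write;        /* 1 = store to device, 0 = load from device */
    uint64_t value;      /* value to write for stores */
};

/* Step 2: find the region that backs `gpa`, or NULL if no device claims it. */
static inline const struct mmio_region *
mmio_find_region(const struct mmio_region *regions, size_t count, uint64_t gpa)
{
    for (size_t i = 0; i < count; i++) {
        if (gpa >= regions[i].base && gpa - regions[i].base < regions[i].size)
            return &regions[i];
    }
    return NULL;
}

/* Steps 2-4: returns 0 on success, -1 if no device backs the address. */
static inline int mmio_emulate(const struct mmio_region *regions, size_t count,
                               const struct mmio_access *a,
                               uint64_t *result, uint64_t *rip)
{
    const struct mmio_region *r = mmio_find_region(regions, count, a->gpa);
    uint64_t v;

    if (r == NULL)
        return -1;
    v = r->access(a->gpa - r->base, a->size, a->is_write, a->value);
    if (!a->is_write)
        *result = v;       /* step 3: caller copies this into the destination register */
    *rip += a->insn_len;   /* step 4: skip the emulated instruction */
    return 0;
}

/* Illustrative one-register device: remembers the last value written. */
static uint64_t scratch_value;

static inline uint64_t scratch_access(uint64_t offset, unsigned size,
                                      int is_write, uint64_t value)
{
    (void)offset;
    (void)size;
    if (is_write)
        scratch_value = value;
    return scratch_value;
}
```

When no region matches, a real hypervisor would have to decide what to do with the access, for example by injecting a fault into the guest; that policy is outside this sketch.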
 
=== Device emulation examples ===
{| {{wikitable}}
|-
! Device !! IO type !! Reference
|-
| Intel 8259 PIC || PIO ||https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/device_8259pic.c
|-
| Intel 8253 PIT || PIO ||https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/device_8253pit.c
|-
| Intel 8042 keyboard || PIO ||https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/device_keyboard.c
|-
| serial port controller || PIO ||https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/device_serial.c
|-
| 16-colors video controller || MMIO ||https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/device_video.c
|-
|}
 
== References ==
 
Bochs's vmx.cc (LGPLv2): http://bochs.cvs.sourceforge.net/viewvc/bochs/bochs/cpu/vmx.cc
 
PIO sub handler: https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/vmx_pio.c
 
Memory move instruction decode: https://github.com/chillancezen/ZeldaOS.x86_64/blob/master/vm_monitor/vmx_instruction_decoding.c
 
 
 
== Other examples ==
VMX implementation in a homemade OS:
http://www.dumais.io/index.php?article=ac3267239dd3e34c061de6413203fb98
 
[[Category:X86]]
[[Category:Virtual]]