Timekeeping in virtual machines

From OSDev.wiki
Latest revision as of 15:46, 9 June 2024

This page is a work in progress.
This page may thus be incomplete. Its content may be changed in the near future.

There are several ways to keep track of time in a VM, but the usual hardware clocks are either very slow (e.g. the HPET, which the hypervisor has to emulate on every access) or do not work correctly if the VM is migrated to another host (e.g. the TSC).

To work around this, hypervisors such as QEMU/KVM provide several paravirtualized ways to keep track of time whilst sacrificing little performance.

KVM_HC_CLOCK_PAIRING

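This hypercall returns a snapshot of a host clock together with the guest TSC value at which it was taken, so the guest can calculate the host's clock from its own TSC. The clock is selected by the second argument: KVM_CLOCK_PAIRING_WALLCLOCK selects the host's CLOCK_REALTIME.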

The host copies the following structure to a physical address given by the guest:

struct kvm_clock_pairing {
    s64 sec;
    s64 nsec;
    u64 tsc;
    u32 flags;
    u32 pad[9];
};
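
As a minimal sketch, the same layout can be declared with fixed-width types and its size checked at compile time, since the host copies the whole 64-byte structure to the address it is given. The field comments follow the Linux KVM hypercall documentation; the helper `kvm_pairing_to_ns` is a hypothetical name used only for illustration and is not part of the KVM interface.

```c
#include <stdint.h>

/* Same layout as above, spelled with <stdint.h> types. */
struct kvm_clock_pairing {
    int64_t  sec;     /* host clock: whole seconds */
    int64_t  nsec;    /* host clock: nanoseconds within the second */
    uint64_t tsc;     /* guest TSC value the sec/nsec pair belongs to */
    uint32_t flags;   /* currently unused */
    uint32_t pad[9];
};

/* The guest buffer must be able to hold all 64 bytes. */
_Static_assert(sizeof(struct kvm_clock_pairing) == 64,
               "unexpected kvm_clock_pairing size");

/* Hypothetical helper: flatten sec/nsec into a single nanosecond count. */
static inline int64_t kvm_pairing_to_ns(const struct kvm_clock_pairing *p)
{
    return p->sec * INT64_C(1000000000) + p->nsec;
}
```

The returned nanosecond count is the host clock at the moment the TSC read `tsc`; later readings can be extrapolated from the TSC difference.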

A hypercall is performed with the `vmcall` instruction (`vmmcall` on AMD CPUs). On KVM, up to four arguments are passed in RBX, RCX, RDX and RSI; RAX holds the hypercall number on entry and the return value on exit. No other registers are clobbered (unless explicitly noted).

For example, calling KVM_HC_CLOCK_PAIRING can be done as follows on x86_64:

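; rdi: physical address to copy structure to
; rsi: clock type (KVM_CLOCK_PAIRING_WALLCLOCK = 0)
; returns: rax = 0 on success, a negative KVM error code otherwise
kvm_hc_clock_pairing:
    push rbx         ; rbx is callee-saved in the System V ABI
    mov eax, 9       ; KVM_HC_CLOCK_PAIRING
    mov rbx, rdi     ; first argument: physical address
    mov rcx, rsi     ; second argument: clock type
    vmcall
    pop rbx
    ret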

pvclock

pvclock is a simple protocol and the fastest way to properly track system time in a VM.

To use it, write the 64-bit, 4-byte-aligned physical address of a buffer, with bit 0 set to 1 as the enable bit, to MSR_KVM_SYSTEM_TIME_NEW (0x4b564d01). The presence of this MSR is indicated by bit 3 of EAX from CPUID leaf 0x40000001.
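
The feature check and the MSR value can be sketched as two small pure functions. This is only an illustration: `KVM_CPUID_FEATURES` and `KVM_FEATURE_CLOCKSOURCE2` are the names Linux uses for the leaf and the bit, while `kvm_has_clocksource2` and `pvclock_msr_value` are hypothetical helpers. Executing CPUID and WRMSR themselves is left to the kernel's own wrappers.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_KVM_SYSTEM_TIME_NEW  0x4b564d01u
#define KVM_CPUID_FEATURES       0x40000001u  /* CPUID leaf to query */
#define KVM_FEATURE_CLOCKSOURCE2 3            /* bit index in EAX */

/* features_eax: EAX as returned by CPUID leaf KVM_CPUID_FEATURES. */
static inline bool kvm_has_clocksource2(uint32_t features_eax)
{
    return (features_eax >> KVM_FEATURE_CLOCKSOURCE2) & 1u;
}

/*
 * Build the value to write to MSR_KVM_SYSTEM_TIME_NEW.
 * Returns 0 if the address is not 4-byte aligned (hypothetical
 * error convention chosen for this sketch).
 */
static inline uint64_t pvclock_msr_value(uint64_t phys_addr)
{
    if (phys_addr & 3u)
        return 0;
    return phys_addr | 1u;   /* bit 0 is the enable bit */
}
```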

The host will write the following structure to this address:

struct pvclock_vcpu_time_info {
    u32 version;
    u32 pad0;
    u64 tsc_timestamp;
    u64 system_time;
    u32 tsc_to_system_mul;
    s8 tsc_shift;
    u8 flags;
    u8 pad[2];
};

The host will automatically update this structure when necessary (e.g. when finishing a migration).

The system time in nanoseconds is calculated as such:

time = rdtsc() - tsc_timestamp;
if (tsc_shift >= 0)
    time <<= tsc_shift;
else
    time >>= -tsc_shift;
time = (time * tsc_to_system_mul) >> 32;  /* the product can exceed 64 bits */
time = time + system_time;
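
The calculation can be written as a self-contained C function, shown here as a sketch rather than a reference implementation. It assumes a compiler that provides `unsigned __int128` (a GCC/Clang extension) so the 64-bit by 32-bit product does not overflow; the function name `pvclock_tsc_to_ns` and its parameter list are chosen for illustration, with the TSC reading passed in by the caller.

```c
#include <stdint.h>

/*
 * Convert a TSC reading to nanoseconds using the fields of
 * pvclock_vcpu_time_info. tsc_now is the caller's rdtsc() result.
 */
static inline uint64_t pvclock_tsc_to_ns(uint64_t tsc_now,
                                         uint64_t tsc_timestamp,
                                         uint64_t system_time,
                                         uint32_t tsc_to_system_mul,
                                         int8_t tsc_shift)
{
    uint64_t delta = tsc_now - tsc_timestamp;

    if (tsc_shift >= 0)
        delta <<= tsc_shift;
    else
        delta >>= -tsc_shift;

    /* 64 x 32 bits can need up to 96 bits, so widen before multiplying. */
    unsigned __int128 product = (unsigned __int128)delta * tsc_to_system_mul;

    return (uint64_t)(product >> 32) + system_time;
}
```

For example, with `tsc_to_system_mul = 0x80000000` (a factor of 0.5) and `tsc_shift = 0`, a delta of 2000 ticks adds 1000 ns to `system_time`.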

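The version field is used to detect when the structure has been or is being updated. If the version is odd, an update is in progress and the guest must not use the other fields yet. The guest should read the version before and after reading the other fields, and retry if the two values differ or are odd.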
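
A reader typically samples the version before and after copying the fields and retries if the version was odd or changed in between. The following is a minimal sketch of that loop, assuming the GCC/Clang `__atomic_thread_fence` builtin for ordering; `pvclock_snapshot` is a hypothetical helper name.

```c
#include <stdint.h>

struct pvclock_vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    uint8_t  flags;
    uint8_t  pad[2];
};

/* Copy a consistent set of fields from the shared structure into *out. */
static inline void pvclock_snapshot(const volatile struct pvclock_vcpu_time_info *src,
                                    struct pvclock_vcpu_time_info *out)
{
    uint32_t before, after;

    do {
        before = src->version;
        __atomic_thread_fence(__ATOMIC_ACQUIRE);  /* version first, then fields */

        out->tsc_timestamp     = src->tsc_timestamp;
        out->system_time       = src->system_time;
        out->tsc_to_system_mul = src->tsc_to_system_mul;
        out->tsc_shift         = src->tsc_shift;
        out->flags             = src->flags;

        __atomic_thread_fence(__ATOMIC_ACQUIRE);  /* fields first, then version */
        after = src->version;
    } while ((before & 1u) || before != after);   /* odd or changed: retry */

    out->version = before;
}
```

In a real reader the TSC would normally be sampled inside the same loop, so that the TSC value and the copied fields belong to the same version.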

Hyper-V TSC page

struct ms_hyperv_tsc_page {
    volatile u32 tsc_sequence;
    u32 reserved1;
    volatile u64 tsc_scale;
    volatile s64 tsc_offset;
    u64 reserved2[509];
};
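
This page does not yet describe how the fields are used. As a hedged sketch, assuming the formula given in Microsoft's Hypervisor Top-Level Functional Specification, the reference time in 100 ns units is the high 64 bits of `tsc * tsc_scale` plus `tsc_offset`. The function name `hv_tsc_to_ref_time` is hypothetical, and `unsigned __int128` is a GCC/Clang extension.

```c
#include <stdint.h>

/*
 * Assumed formula (Hyper-V TLFS):
 *   reference_time = ((tsc * tsc_scale) >> 64) + tsc_offset
 * The result is in units of 100 ns.
 */
static inline uint64_t hv_tsc_to_ref_time(uint64_t tsc,
                                          uint64_t tsc_scale,
                                          int64_t tsc_offset)
{
    unsigned __int128 product = (unsigned __int128)tsc * tsc_scale;

    return (uint64_t)(product >> 64) + (uint64_t)tsc_offset;
}
```

Under the same assumption, the guest is expected to read `tsc_sequence` before and after the other fields and retry if it changed, and to treat a sequence of 0 as meaning the page is not valid.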

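See also

References

* Linux KVM Hypercall: https://docs.kernel.org/virt/kvm/x86/hypercalls.html
* KVM-specific MSRs: https://docs.kernel.org/virt/kvm/x86/msr.html
* An introduction to timekeeping in Linux VMs: https://opensource.com/article/17/6/timekeeping-linux-vms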