Non Maskable Interrupt: Difference between revisions

m
no edit summary
[unchecked revision][unchecked revision]
No edit summary
mNo edit summary
Line 8:
The short version of this story is that there's only really 2 reasons for an NMI. The first reason is a hardware failure. The second reason is a "watchdog timer", which can be used to detect when the kernel itself locks up (and is sometimes also used for more accurate profiling as it allows EIP to be sampled even when IRQs are disabled).
 
If a hardware failure caused an NMI then there's no way to figure out which piece of hardware caused the NMI. In this case I'd try to do the least possible in an attempt to tell the user that a hardware failure occuredoccurred, but at the end of the day you can't expect any OS to work sanely on faulty hardware and there's nothing software can do to work around the hardware failure anyway.
 
For the watchdog timer, it must be setup by the OS first. This can actually be done even when the chipset itself doesn't have a special watchdog timer for it (e.g. setting the PIT, RTC/CMOS IRQ or a HPET IRQ to "NMI, send to all CPUs" in the I/O APIC). In this case you want the watchdog timer to be fast (i.e. no slow hardware task switching and cache flushing) and you'd also want all CPUs to share the same timer, which means all CPUs would receive the same IRQ at the same time (which brings me back to the busy flag in your TSS).
Anonymous user