FPU: Difference between revisions

{{Floats}}
The x86 FPU was originally an optional addition to the processor that performed floating-point math in hardware. It has since been integrated into the CPU proper and has, over the years, accumulated the majority of math-heavy instructions. "FPU" has become a legacy term for what are now the vector processing units, which happen to include the original floating-point operations.
 
== x86 FPU Legacy ==
 
Originally, the FPU was a dedicated coprocessor chip installed alongside the actual processor. Since it performed calculations asynchronously from the core logic, its results only became available after the main processor had executed several other instructions. Since errors also surfaced asynchronously, the original PC had the error line of the FPU wired to the [[PIC|interrupt controller]]. When the 486 added multiprocessor support, it became impossible to tell which of the FPUs had raised an exception, so the FPU was integrated on-die and an option was added to signal a regular exception rather than an interrupt. To provide backwards compatibility, the 486 was given a pin to replace the original FPU error line, which is routed to the PIC and then back into the CPU's IRQ line to simulate the original setup with a dedicated coprocessor. This has the unfortunate consequence that, by default, floating point exceptions do not operate as recommended by the manual.
 
== FPU configuration ==
Line 10 ⟶ 11:
 
== Detecting an FPU ==
On x86 processors up to the 386, FPUs were external and strictly optional. They allowed the use of different floating-point units, including those which did not strictly correspond to the processor's generation. For example, the 386 was capable of operating with both a 287 (the FPU corresponding to the 286), and the 387 (the contemporary FPU). The 486 line of microprocessors was bifurcated into the 486DX, which included an on-chip floating-point unit, and the 486SX, which did not. The external 487 coprocessor was essentially a modified 486DX that disabled the installed CPU. All x86 CPUs from the Pentium onward have an integrated FPU present (excluding the [https://en.wikipedia.org/wiki/NexGen NexGen 5x86]).
 
There are two ways to detect an FPU:
Line 20 ⟶ 21:
 
The common way of testing for the presence of an FPU is to have it write its status somewhere and then check whether it actually did.
<syntaxhighlight lang="asm">
MOV EDX, CR0 ; Start probe, get CR0
AND EDX, (-1) - (CR0_TS + CR0_EM) ; clear TS and EM to force fpu access
Line 31 ⟶ 32:
 
.testword: DW 0x55AA ; store garbage to be able to detect a change
</syntaxhighlight>
 
To distinguish a 287 from a 387 FPU, you can test whether it sees a difference between +infinity and -infinity.
Line 41 ⟶ 42:
:If the EM bit is set, x87 FPU instructions raise #NM (device not available) so they can be '''EM'''ulated in software, while MMX/SSE vector instructions raise #UD. Should be off to actually be able to use the FPU.
'''CR0.ET''' (bit 4)
:This bit is used on the 386 to tell it how to communicate with the coprocessor: 0 for a 287, and 1 for a 387 or later. This bit is hardwired to 1 on 486+.
'''CR0.NE''' (bit 5)
:When set, enables '''N'''ative '''E'''xception handling which will use the FPU exceptions. When cleared, an exception is sent via the interrupt controller. Should be on for 486+, but not on 386s because they lack that bit.
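
As a rough illustration of the bits described above, the following C sketch computes a CR0 value suitable for native FPU use from an existing one. The helper name <code>cr0_for_native_fpu</code> and its <code>has_ne_bit</code> parameter are assumptions made for this example, not part of any existing API. It only manipulates the value; actually reading and writing CR0 still requires privileged <code>MOV</code> instructions.

```c
#include <stdint.h>

/* CR0 bit positions used in this article. */
#define CR0_EM (1u << 2) /* emulate coprocessor */
#define CR0_TS (1u << 3) /* task switched */
#define CR0_NE (1u << 5) /* native exception handling (486+) */

/*
 * Hypothetical helper: given the current CR0 value, return the value to
 * write back so that FPU instructions execute natively.
 * has_ne_bit should be nonzero on a 486 or later, zero on a 386.
 */
uint32_t cr0_for_native_fpu(uint32_t cr0, int has_ne_bit)
{
    cr0 &= ~(CR0_EM | CR0_TS);  /* no emulation, no pending task-switch trap */
    if (has_ne_bit)
        cr0 |= CR0_NE;          /* report FPU errors as regular exceptions */
    return cr0;                 /* all other bits, including ET, are untouched */
}
```

The 386 case leaves bit 5 alone because, as noted above, that processor lacks the NE bit.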
Line 67 ⟶ 68:
== FPU state ==
Once the FPU is configured, the only thing left to do is to initialize its registers to their proper states. FNINIT resets the user-visible part of the FPU state. This sets precision to extended (64-bit mantissa) and rounding to nearest, which should be correct for most operations. It also masks all exceptions so that none of them cause an interrupt. You can change the control word by issuing an FLDCW. To diagnose broken code, you usually want to enable exceptions for invalid operands and stack overflows by clearing their mask (bit 0). Clearing bit 2 lets you catch divisions by zero as well. Some examples:
<syntaxhighlight lang="asm">; FLDCW requires a 16-bit memory operand, immediates do not work
FLDCW [value_37F] ; writes 0x37f into the control word: the value written by F(N)INIT
FLDCW [value_37E] ; writes 0x37e, the default with invalid operand exceptions enabled
FLDCW [value_37A] ; writes 0x37a, both division by zero and invalid operands cause exceptions.</syntaxhighlight>
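
The control word values above can also be derived rather than memorized. The sketch below is one possible way to do that in C, starting from the F(N)INIT default and clearing mask bits. The macro and function names are invented for this example.

```c
#include <stdint.h>

#define FPU_CW_DEFAULT 0x037Fu    /* value written by F(N)INIT */
#define FPU_CW_IM      (1u << 0)  /* mask: invalid operation / stack fault */
#define FPU_CW_ZM      (1u << 2)  /* mask: division by zero */

/*
 * Hypothetical helper: return cw with the given exception masks cleared,
 * which enables those exceptions once the result is loaded with FLDCW.
 */
uint16_t fpu_cw_unmask(uint16_t cw, uint16_t exceptions)
{
    return (uint16_t)(cw & ~exceptions);
}
```

With these definitions, <code>fpu_cw_unmask(FPU_CW_DEFAULT, FPU_CW_IM)</code> yields 0x37E and <code>fpu_cw_unmask(FPU_CW_DEFAULT, FPU_CW_IM | FPU_CW_ZM)</code> yields 0x37A, matching the examples above.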
 
 
Line 90 ⟶ 91:
== Rent-a-coder ==
These functions can be used with GCC (or TCC) to perform some FPU operations without resorting to dedicated assembly:
<syntaxhighlight lang="c">#include <stdint.h>

/* Load a new x87 control word; FLDCW only accepts a memory operand. */
void fpu_load_control_word(const uint16_t control)
{
    asm volatile("fldcw %0;"::"m"(control));
}</syntaxhighlight>
 
==See Also==
===External Links===
* [http://www.website.masmforum.com/tutorials/fptute/ Simply FPU], a practical guide covering the FPU basics from a userland perspective
* [http://www.ragestorm.net/downloads/387intel.txt Intel 80387 Programmer's Reference Manual], complete with example code
* [http://developer.amd.com/documentation/guides/pages/default.aspx#manuals AMD Programmer's Manuals], has FPU instruction reference conveniently ordered by processor component.
Line 103 ⟶ 104:
 
[[Category:X86]]
[[de:FPU_(x86)]]