Symmetric Multiprocessing: Difference between revisions

From OSDev.wiki
(added information how to find the MP tables)
Revision as of 08:54, 18 February 2015


Symmetric Multiprocessing (or SMP) is one method of having multiple processors in one computer system. In an SMP system (as opposed to a NUMA system) all logical cores are able to see the entire memory of the system. Note, however, that SMP and NUMA are not mutually exclusive: as Brendan has pointed out on the forums, Intel's Core i7 implements both SMP and NUMA, as well as hyper-threading.

Initialisation of an old SMP system

The startup sequence is different for different CPUs. Intel's system programmer's manual (section 7.5.4) contains the initialization protocol for Intel Xeon processors, and doesn't cover older CPUs. For the generic "all CPU types" algorithm, see Intel's MultiProcessor Specification.

For the 80486 (with an external 82489DX local APIC), you must use an INIT IPI followed by an "INIT level de-assert" IPI, without any SIPIs. This means you can't tell the APs where to start executing (the vector part of a SIPI), so they always start executing BIOS code. In this case you set the BIOS's CMOS reset value to "warm start with far jump" (i.e. set CMOS location 0x0F to the value 10, or 0x0A) so that the BIOS will do a "jmp far [0:0x0467]", and then put the offset and segment of your AP entry point in the warm reset vector (offset word at 0x0467, segment word at 0x0469).

The "INIT level de-assert" IPI isn't supported on newer CPUs (Pentium 4 and Intel Xeon), and AFAIK it is ignored completely on these CPUs.

For newer CPUs (P6, Pentium 4) one SIPI is enough, but I'm not sure if older Intel CPUs (Pentium) or CPUs from other manufacturers need a second SIPI or not. It's also possible that the second SIPI is there in case there's a delivery failure for the first SIPI (bus noise, etc).

I normally send the first SIPI and then wait to see if the AP CPU increases a "number of started CPUs" counter. If it doesn't increase this counter within a few milliseconds, then I send the second SIPI. This is different from Intel's generic algorithm (which has a 200 microsecond delay between SIPIs), but finding a time source capable of accurately measuring a 200 microsecond delay during early boot isn't so easy. I've also found that on real hardware, if the delay between SIPIs is too long (and you don't use my method), an AP CPU can run the OS's early AP startup code twice (which in my case would lead to the OS thinking there are twice as many AP CPUs as there really are).
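The counter-based approach described above might be sketched as follows. This is only an illustration under stated assumptions: `ap_boot_ops`, `start_ap`, the `demo_*` stand-ins, the 5 ms wait per SIPI and the 10 ms pause after INIT are all choices made for this sketch, not something the text prescribes. The ICR values are the commonly used INIT and Start-up encodings, and the entry point is assumed to be 4 KiB aligned and below 1 MiB.

```c
#include <stdint.h>

/* ICR low-dword values commonly used for these IPIs: INIT (0x500) and
 * Start-up (0x600) delivery modes with the level bit (0x4000) asserted. */
#define ICR_INIT     0x00004500u
#define ICR_STARTUP  0x00004600u

/* Hypothetical platform hooks; a real kernel would write the local APIC
 * ICR registers and use its own timer here. */
struct ap_boot_ops {
    void (*send_ipi)(uint32_t icr_high, uint32_t icr_low);
    void (*delay_ms)(unsigned ms);
};

/* Destination field: APIC ID in bits 24-31 of the ICR high dword. */
static uint32_t icr_dest(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* The SIPI vector is the 4 KiB page number of the AP entry point. */
static uint32_t sipi_icr_low(uint32_t entry_phys)
{
    return ICR_STARTUP | ((entry_phys >> 12) & 0xFFu);
}

/* Send INIT, then up to two SIPIs; the second SIPI is sent only if the
 * AP has not bumped *started within the wait window.
 * Returns 1 if the AP reported in, 0 otherwise. */
static int start_ap(const struct ap_boot_ops *ops, uint8_t apic_id,
                    uint32_t entry_phys, volatile uint32_t *started)
{
    uint32_t before = *started;

    ops->send_ipi(icr_dest(apic_id), ICR_INIT);
    ops->delay_ms(10);                      /* assumed settle time after INIT */

    for (int attempt = 0; attempt < 2; attempt++) {
        ops->send_ipi(icr_dest(apic_id), sipi_icr_low(entry_phys));
        for (int waited = 0; waited < 5; waited++) {   /* assumed 5 ms window */
            if (*started != before)
                return 1;
            ops->delay_ms(1);
        }
    }
    return *started != before;
}

/* Demo-only stand-ins so the sketch can run on a host: the fake "AP"
 * bumps the counter as soon as it sees a Start-up IPI. */
static volatile uint32_t demo_started;
static unsigned demo_ipis_sent;

static void demo_send_ipi(uint32_t icr_high, uint32_t icr_low)
{
    (void)icr_high;
    demo_ipis_sent++;
    if ((icr_low & 0x700u) == 0x600u)
        demo_started++;
}

static void demo_delay_ms(unsigned ms)
{
    (void)ms;
}
```

With the demo hooks, `start_ap` sends the INIT and a single SIPI and returns as soon as the counter changes, so the second SIPI is never sent to an AP that has already started.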

You can broadcast these signals across the bus to start every processor that is present. However, by doing so you might also enable processors that were disabled on purpose (because they were defective).

Finding information

There is information about a computer's processors, APICs, etc. in the ACPI tables. However, some information dedicated to multiprocessing is also available in the older MP tables, which may not be present on newer machines. First one must find the MP Floating Pointer Structure. It is aligned on a 16-byte boundary and starts with the signature "_MP_" (0x5F504D5F when read as a little-endian 32-bit value). The OS must search the first kilobyte of the EBDA, the last kilobyte of "base memory", and the BIOS ROM space (0xF0000 to 0xFFFFF). The size of base memory is given by the 2-byte value at 0x413, which the BIOS reports in kilobytes, minus 1K. Here is what the structure looks like:

#include <stdint.h>

struct mp_floating_pointer_structure {
    char signature[4]; // "_MP_"
    uint32_t configuration_table; // Physical address of the configuration table
    uint8_t length; // In 16-byte units (e.g. 1 = 16 bytes, 2 = 32 bytes)
    uint8_t mp_specification_revision;
    uint8_t checksum; // Chosen so that all bytes of the structure add up to 0 (modulo 256)
    uint8_t default_configuration; // If this is not zero then configuration_table should be
                                   // ignored and a default configuration should be loaded instead
    uint32_t features; // If bit 7 is set then the IMCR is present and PIC mode is being used,
                       // otherwise virtual wire mode is; all other bits are reserved
};
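A search of one candidate region for this structure could be sketched as below. It is a minimal sketch, assuming the region has already been mapped into a buffer whose start corresponds to a 16-byte-aligned physical address; `mp_byte_sum` and `mp_find_floating_pointer` are names invented for the example, and the length byte is read from offset 8 as implied by the structure layout above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum of `len` bytes modulo 256; a valid MP structure sums to 0. */
static uint8_t mp_byte_sum(const uint8_t *p, size_t len)
{
    uint8_t sum = 0;
    while (len--)
        sum = (uint8_t)(sum + *p++);
    return sum;
}

/*
 * Scan `size` bytes at `region` on 16-byte boundaries for a floating
 * pointer structure with a matching signature and checksum.
 * Returns the byte offset of the match, or -1 if none is found.
 */
static long mp_find_floating_pointer(const uint8_t *region, size_t size)
{
    for (size_t off = 0; off + 16 <= size; off += 16) {
        const uint8_t *p = region + off;

        if (memcmp(p, "_MP_", 4) != 0)
            continue;

        size_t len = (size_t)p[8] * 16;   /* length field, in 16-byte units */
        if (len == 0 || off + len > size)
            continue;

        if (mp_byte_sum(p, len) == 0)
            return (long)off;
    }
    return -1;
}
```

The same function would be called once per candidate region (EBDA, end of base memory, BIOS ROM), stopping at the first match. Checking the checksum as well as the signature guards against a stray "_MP_" byte sequence in ROM.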

Here is what the configuration table, pointed to by the floating pointer structure, looks like:

struct mp_configuration_table {
    char signature[4]; // "PCMP"
    uint16_t length; // Length of the table in bytes, including this header
    uint8_t mp_specification_revision;
    uint8_t checksum; // Again, chosen so that all bytes in the table add up to 0 (modulo 256)
    char oem_id[8];
    char product_id[12];
    uint32_t oem_table;
    uint16_t oem_table_size;
    uint16_t entry_count; // This value represents how many entries are following this table
    uint32_t lapic_address; // This is the memory-mapped address of the local APICs
    uint16_t extended_table_length;
    uint8_t extended_table_checksum;
    uint8_t reserved;
};

The configuration table is followed by entry_count entries that describe the system in more detail, and after those comes an extended table. Each entry starts with a type byte; processor entries (type 0) are 20 bytes long and all other entry types are 8 bytes long. Here is what a processor entry looks like:

 
struct entry_processor {
    uint8_t type; // Always 0
    uint8_t local_apic_id;
    uint8_t local_apic_version;
    uint8_t flags; // If bit 0 is clear then the processor must be ignored
                   // If bit 1 is set then the processor is the bootstrap processor
    uint32_t signature;
    uint32_t feature_flags;
    uint32_t reserved[2]; // 8 reserved bytes; a uint64_t here could be padded to an
                          // 8-byte boundary and make the entry larger than 20 bytes
};

Here is an IO APIC entry.

struct entry_io_apic {
    uint8_t type; // Always 2
    uint8_t id;
    uint8_t version;
    uint8_t flags; // If bit 0 is clear then the entry should be ignored
    uint32_t address; // The memory-mapped address of the IO APIC
};
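Walking the entries that follow the configuration table might look like the sketch below. It is written against a raw byte buffer rather than the structs, so the offsets (length at 4, entry_count at 34, 44-byte header, flags at offset 3 of each entry) are derived from the layouts above. `mp_counts` and `mp_count_entries` are hypothetical names, and the sketch treats bit 0 of the flags byte as the enable bit, as the MP specification defines it (clear means unusable).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MP_ENTRY_PROCESSOR = 0, MP_ENTRY_IO_APIC = 2 };

struct mp_counts {
    unsigned processors;   /* processor entries with the enable bit set */
    unsigned io_apics;     /* IO APIC entries with the enable bit set */
};

/*
 * Validate the base configuration table held in `table` and count the
 * usable processors and IO APICs among its entries.
 * Returns 0 on success, -1 if the table is malformed.
 */
static int mp_count_entries(const uint8_t *table, size_t size,
                            struct mp_counts *out)
{
    if (size < 44 || memcmp(table, "PCMP", 4) != 0)
        return -1;

    /* Multi-byte fields are little-endian. */
    size_t length = (size_t)table[4] | ((size_t)table[5] << 8);
    unsigned entry_count = (unsigned)table[34] | ((unsigned)table[35] << 8);
    if (length < 44 || length > size)
        return -1;

    /* All bytes of the table, checksum included, must sum to 0. */
    uint8_t sum = 0;
    for (size_t i = 0; i < length; i++)
        sum = (uint8_t)(sum + table[i]);
    if (sum != 0)
        return -1;

    out->processors = 0;
    out->io_apics = 0;

    size_t off = 44;                       /* entries start after the header */
    for (unsigned i = 0; i < entry_count; i++) {
        if (off >= length)
            return -1;
        uint8_t type = table[off];
        size_t entry_size = (type == MP_ENTRY_PROCESSOR) ? 20 : 8;
        if (off + entry_size > length)
            return -1;

        int enabled = table[off + 3] & 1;  /* flags byte, bit 0 */
        if (type == MP_ENTRY_PROCESSOR && enabled)
            out->processors++;
        else if (type == MP_ENTRY_IO_APIC && enabled)
            out->io_apics++;

        off += entry_size;
    }
    return 0;
}
```

Because entries have different sizes, the walker has to read the type byte of each entry before it can step to the next one; indexing the entries as a fixed-size array would not work.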

For more information, see http://www.intel.com/design/pentium/datashts/24201606.pdf, chapter 4.

See Also

Articles

Threads

External Links