Symmetric Multiprocessing

Some information dedicated to multiprocessing is available, although it may not be present on newer machines. First, one must find the MP Floating Pointer Structure. It is aligned on a 16-byte boundary and starts with the signature "_MP_" (0x5F504D5F when read as a little-endian 32-bit value). The OS must search the EBDA, the BIOS ROM space, and the last kilobyte of "base memory"; the size of base memory, in kilobytes minus 1K, is stored in a 2-byte value at 0x413.
Here is what the structure looks like:
<syntaxhighlight lang="c">
struct mp_floating_pointer_structure {
    char signature[4];
    // ...
                                // virtual wire mode is; all other bits are reserved
};
</syntaxhighlight>
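The search described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the names <code>mp_scan</code> and <code>mp_find</code> are hypothetical, <code>low_mem</code> is assumed to be a mapping of the first megabyte of physical memory, the EBDA segment is read from the BIOS data area word at 0x40E, and the word at 0x413 is interpreted as the offset (in kilobytes) of the last kilobyte of base memory. It also assumes the rule from the MP specification that the 16 bytes of the structure sum to zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan `length` bytes at `start` for a 16-byte aligned "_MP_" signature
 * whose 16 bytes sum to zero (checksum rule assumed from the MP specification).
 * Returns a pointer to the structure, or NULL if none is found. */
static const uint8_t *mp_scan(const uint8_t *start, size_t length)
{
    for (size_t off = 0; off + 16 <= length; off += 16) {
        const uint8_t *p = start + off;
        if (memcmp(p, "_MP_", 4) != 0)
            continue;
        uint8_t sum = 0;
        for (size_t i = 0; i < 16; i++)
            sum = (uint8_t)(sum + p[i]);
        if (sum == 0)
            return p;
    }
    return NULL;
}

/* `low_mem` is assumed to point at a mapping of physical addresses 0..1 MiB. */
static const uint8_t *mp_find(const uint8_t *low_mem)
{
    /* Real-mode segment of the EBDA, stored as a 16-bit word at 0x40E. */
    uint32_t ebda = (uint32_t)(low_mem[0x40E] | (low_mem[0x40F] << 8)) << 4;
    /* Word at 0x413, read here as the KiB offset of the last kilobyte. */
    uint32_t last_kb = (uint32_t)(low_mem[0x413] | (low_mem[0x414] << 8)) * 1024;
    const uint8_t *hit = NULL;

    if (ebda != 0 && ebda + 1024 <= 0x100000)
        hit = mp_scan(low_mem + ebda, 1024);        /* first KiB of the EBDA */
    if (hit == NULL)
        hit = mp_scan(low_mem + 0xF0000, 0x10000);  /* BIOS ROM space */
    if (hit == NULL && last_kb + 1024 <= 0x100000)
        hit = mp_scan(low_mem + last_kb, 1024);     /* last KiB of base memory */
    return hit;
}
```

In a kernel that identity-maps low memory, <code>mp_find((const uint8_t *)0)</code> would be the natural call; with any other mapping, pass the virtual address at which physical address 0 is visible.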
Here is what the configuration table, pointed to by the floating pointer structure, looks like:
<syntaxhighlight lang="c">
struct mp_configuration_table {
    char signature[4]; // "PCMP"
    // ...
    uint8_t reserved;
};
</syntaxhighlight>
After the configuration table, there are entry_count entries describing more information about the system, followed by an extended table. Each entry is either 20 bytes long (a processor entry) or 8 bytes long (any other entry type). Here is what a processor entry looks like.
<syntaxhighlight lang="c">
struct entry_processor {
    uint8_t type; // Always 0
    // ...
    uint64_t reserved;
};
</syntaxhighlight>
Here is an IO APIC entry.
<syntaxhighlight lang="c">
struct entry_io_apic {
    uint8_t type; // Always 2
    // ...
    uint32_t address; // The memory-mapped address of the IO APIC
};
</syntaxhighlight>
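Because the entries have two different sizes, the list has to be walked by looking at the type byte of each entry. The following is a minimal sketch of that walk, assuming <code>entries</code> points at the first entry right after the configuration table and that <code>entry_count</code> fits in 16 bits. The names <code>mp_entry_counts</code> and <code>mp_walk_entries</code> are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what was found in the entry list. */
struct mp_entry_counts {
    unsigned processors;  /* type 0 entries, 20 bytes each */
    unsigned io_apics;    /* type 2 entries, 8 bytes each */
    unsigned others;      /* every other type, 8 bytes each */
};

/* Walk `entry_count` entries starting at `entries`, stepping by 20 bytes
 * over processor entries and by 8 bytes over everything else. */
static struct mp_entry_counts mp_walk_entries(const uint8_t *entries,
                                              uint16_t entry_count)
{
    struct mp_entry_counts counts = {0, 0, 0};
    const uint8_t *p = entries;

    for (uint16_t i = 0; i < entry_count; i++) {
        switch (p[0]) {           /* first byte of every entry is its type */
        case 0:
            counts.processors++;
            p += 20;
            break;
        case 2:
            counts.io_apics++;
            p += 8;
            break;
        default:
            counts.others++;
            p += 8;
            break;
        }
    }
    return counts;
}
```

A real kernel would cast <code>p</code> to the matching entry structure inside each case instead of only counting.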
For more information, see http://www.intel.com/design/pentium/datashts/24201606.pdf, chapter 4.
 
==Finding information using ACPI==
You should be able to find a [[MADT]] (APIC) table in the [[RSDT]] table or in the [[XSDT]] table. The table has a list of local APICs, the number of which should be the same as the number of cores on your processor. Details of these tables are not listed here, but you can find them easily on this wiki.
 
==AP startup==
===Startup Sequence===
The MP specification contains a standard method to start an AP, but using it is not recommended: it relies on very precise timings which, if done incorrectly, can lead to several problems. Brendan offers an alternative method, which should be done for each AP '''individually'''. First send an INIT IPI and wait 10 milliseconds. Then send a SIPI and poll, with a timeout of 1 millisecond, for a flag that the AP's trampoline code sets. If the timeout is reached, send another SIPI and poll for the same flag, this time with a timeout of 1 second. If the AP managed to set the flag, the BSP should set another flag to allow the AP to continue (probably to wait until the scheduler has a process for it to execute).
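The sequence just described can be sketched as plain control flow. This is only an illustration under stated assumptions: the hooks in <code>struct ap_boot_ops</code> are hypothetical and would be implemented with the LAPIC and a timer in a real kernel, the flags are assumed to be bytes shared with the trampoline, and the <code>sim_*</code> functions are stand-ins that only count calls so the logic can be tried on a host machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hardware hooks; a kernel supplies real implementations. */
struct ap_boot_ops {
    void (*send_init)(uint8_t apic_id);
    void (*send_sipi)(uint8_t apic_id);
    void (*delay_us)(uint32_t usec);
};

/* Poll *flag in 100 us steps until it is non-zero or timeout_us has passed. */
static bool wait_for_flag(const struct ap_boot_ops *ops,
                          const volatile uint8_t *flag, uint32_t timeout_us)
{
    for (uint32_t waited = 0; waited < timeout_us; waited += 100) {
        if (*flag)
            return true;
        ops->delay_us(100);
    }
    return *flag != 0;
}

/* Start a single AP; returns true if it reported in and was released. */
static bool start_one_ap(const struct ap_boot_ops *ops, uint8_t apic_id,
                         volatile uint8_t *ap_alive, volatile uint8_t *ap_go)
{
    *ap_alive = 0;
    ops->send_init(apic_id);
    ops->delay_us(10000);                          /* wait 10 ms */
    ops->send_sipi(apic_id);
    if (!wait_for_flag(ops, ap_alive, 1000)) {     /* 1 ms timeout */
        ops->send_sipi(apic_id);                   /* second attempt */
        if (!wait_for_flag(ops, ap_alive, 1000000))/* 1 s timeout */
            return false;
    }
    *ap_go = 1;                                    /* let the AP continue */
    return true;
}

/* Host-side stand-ins (not real hardware access): they only count calls.
 * The simulated AP "answers" once sim_answer_after SIPIs have been sent. */
static unsigned sim_inits, sim_sipis, sim_answer_after;
static volatile uint8_t sim_alive;
static void sim_send_init(uint8_t id) { (void)id; sim_inits++; }
static void sim_send_sipi(uint8_t id)
{
    (void)id;
    if (++sim_sipis >= sim_answer_after)
        sim_alive = 1;
}
static void sim_delay_us(uint32_t usec) { (void)usec; }
```

Passing the hooks in a structure keeps the timing logic separate from the LAPIC and timer code, which makes it possible to exercise the retry path without real hardware.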
 
However, that alternative method is overcomplicated. It is much simpler to do it the other way around: just send the two SIPIs and make the APs wait for the BSP, as in the example code below.
 
===Timing===
The easiest method for the timings is to use the PIT's mode 0. Write 0x30 to IO port 0x43 (select mode 0 for counter 0), then write your count value to port 0x40, LSB first (e.g. write 0xA9 then 0x04 for a millisecond). To check whether the counter has finished, write 0xE2 to IO port 0x43, then read a status byte from port 0x40. If bit 7 (mask 0x80) is set, the counter has finished.
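The count value follows from the PIT's input clock of 1193182 Hz, which is where the 0xA9, 0x04 pair for one millisecond comes from. The sketch below only computes the bytes to write and interprets the status byte; the actual port writes are left to the kernel's own port IO routines. The function names are hypothetical.

```c
#include <stdint.h>

#define PIT_HZ 1193182u  /* input clock of the PIT in Hz */

/* Counter value for a delay in microseconds.
 * Returns 0 if the delay is too short or does not fit in 16 bits. */
static uint16_t pit_count_for_us(uint32_t usec)
{
    uint64_t count = (uint64_t)PIT_HZ * usec / 1000000u;
    return (count >= 1 && count <= 0xFFFF) ? (uint16_t)count : 0;
}

/* Fill out[] with the bytes for a mode 0 one-shot on counter 0:
 * out[0] goes to port 0x43, out[1] and then out[2] go to port 0x40.
 * Returns 0 on success, -1 if the delay cannot be represented. */
static int pit_mode0_bytes(uint32_t usec, uint8_t out[3])
{
    uint16_t count = pit_count_for_us(usec);
    if (count == 0)
        return -1;
    out[0] = 0x30;                    /* counter 0, LSB then MSB, mode 0 */
    out[1] = (uint8_t)(count & 0xFF); /* LSB */
    out[2] = (uint8_t)(count >> 8);   /* MSB */
    return 0;
}

/* Interpret the status byte read from port 0x40 after writing 0xE2 to 0x43. */
static int pit_status_finished(uint8_t status)
{
    return (status & 0x80) != 0;
}
```

Since the counter is 16 bits wide, a single mode 0 countdown covers at most about 54 milliseconds; longer delays such as the 1 second timeout need to be built from repeated shorter ones.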
===Sending IPIs===
IPIs are sent through the BSP's LAPIC. Find the LAPIC base address from the MP tables or ACPI tables; you can then send IPIs by writing 32-bit words to base + 0x300 and base + 0x310. For an INIT IPI or a startup IPI, you must first write the target LAPIC ID into bits 24-31 of base + 0x310 (older processors only implement bits 24-27). Then, for an INIT IPI, write 0x00004500 to base + 0x300. For a SIPI, write 0x00004600, ORed with the page number at which you want the AP to start executing, to base + 0x300. For more information, see http://wiki.osdev.org/APIC#Local_APIC_registers.
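As a minimal sketch of the register values described above: the helper names are hypothetical, <code>lapic</code> is assumed to be a pointer to the memory-mapped LAPIC registers, and the trampoline address is assumed to be page aligned and below 1 MiB so that its page number fits in the 8-bit vector field.

```c
#include <stdint.h>

#define LAPIC_ICR_LOW  0x300  /* writing this register sends the IPI */
#define LAPIC_ICR_HIGH 0x310  /* destination field */

/* Value for base + 0x310: target LAPIC ID in bits 24-31. */
static uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* Value for base + 0x300 for an INIT IPI. */
static uint32_t icr_low_init(void)
{
    return 0x00004500u;
}

/* Value for base + 0x300 for a SIPI that starts the AP at trampoline_phys.
 * Returns 0 if the address is not page aligned or not below 1 MiB. */
static uint32_t icr_low_sipi(uint32_t trampoline_phys)
{
    if ((trampoline_phys & 0xFFF) != 0 || trampoline_phys >= 0x100000)
        return 0;
    return 0x00004600u | (trampoline_phys >> 12);
}

/* Write the destination first; the write to base + 0x300 triggers the IPI. */
static void lapic_send_ipi(volatile uint32_t *lapic, uint8_t apic_id,
                           uint32_t icr_low)
{
    lapic[LAPIC_ICR_HIGH / 4] = icr_high_for(apic_id);
    lapic[LAPIC_ICR_LOW / 4] = icr_low;
}
```

For a trampoline at 0x8000, <code>icr_low_sipi(0x8000)</code> yields 0x00004608, which matches the vector used in the example code on this page.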
 
==Example Code==
When you start testing your SMP code on real machines, you'll find that they do not always follow the standard. You must not send broadcast INIT IPIs or broadcast SIPIs. The following example has a decent amount of error checking, so that it can be used on real hardware. It needs three variables:
* ''numcores'': the number of valid cores
* ''lapic_ids'': an array of Local APIC IDs, with numcores elements
* ''lapic_ptr'': the pointer to the Local APIC registers
You can get these from the PCMP table above, or see the example code on the [[MADT]] page.
 
=== BSP Initialization Code ===
<syntaxhighlight lang="c">
volatile uint8_t aprunning = 0;  // count how many APs have started
volatile uint8_t bspdone = 0;    // flag the APs spin on until the BSP has finished
uint8_t bspid;                   // the BSP's Local APIC ID
int i, j;
// get the BSP's Local APIC ID (CPUID leaf 1, EBX bits 24-31)
__asm__ __volatile__ ("mov $1, %%eax; cpuid; shrl $24, %%ebx;" : "=b"(bspid) : : "eax", "ecx", "edx");

// copy the AP trampoline code to a fixed address in low conventional memory (to address 0x0800:0x0000)
memcpy((void*)0x8000, &ap_trampoline, 4096);

// for each Local APIC ID we do...
for(i = 0; i < numcores; i++) {
    // do not start BSP, that's already running this code
    if(lapic_ids[i] == bspid) continue;
    // send INIT IPI
    *((volatile uint32_t*)(lapic_ptr + 0x280)) = 0;                                                                             // clear APIC errors
    *((volatile uint32_t*)(lapic_ptr + 0x310)) = (*((volatile uint32_t*)(lapic_ptr + 0x310)) & 0x00ffffff) | ((uint32_t)lapic_ids[i] << 24); // select AP by its Local APIC ID
    *((volatile uint32_t*)(lapic_ptr + 0x300)) = (*((volatile uint32_t*)(lapic_ptr + 0x300)) & 0xfff00000) | 0x00C500;          // trigger INIT IPI
    do { __asm__ __volatile__ ("pause" : : : "memory"); }while(*((volatile uint32_t*)(lapic_ptr + 0x300)) & (1 << 12));         // wait for delivery
    *((volatile uint32_t*)(lapic_ptr + 0x310)) = (*((volatile uint32_t*)(lapic_ptr + 0x310)) & 0x00ffffff) | ((uint32_t)lapic_ids[i] << 24); // select AP by its Local APIC ID
    *((volatile uint32_t*)(lapic_ptr + 0x300)) = (*((volatile uint32_t*)(lapic_ptr + 0x300)) & 0xfff00000) | 0x008500;          // deassert
    do { __asm__ __volatile__ ("pause" : : : "memory"); }while(*((volatile uint32_t*)(lapic_ptr + 0x300)) & (1 << 12));         // wait for delivery
    mdelay(10);                                                                                                                 // wait 10 msec
    // send STARTUP IPI (twice)
    for(j = 0; j < 2; j++) {
        *((volatile uint32_t*)(lapic_ptr + 0x280)) = 0;                                                                         // clear APIC errors
        *((volatile uint32_t*)(lapic_ptr + 0x310)) = (*((volatile uint32_t*)(lapic_ptr + 0x310)) & 0x00ffffff) | ((uint32_t)lapic_ids[i] << 24); // select AP by its Local APIC ID
        *((volatile uint32_t*)(lapic_ptr + 0x300)) = (*((volatile uint32_t*)(lapic_ptr + 0x300)) & 0xfff0f800) | 0x000608;      // trigger STARTUP IPI for 0800:0000
        udelay(200);                                                                                                            // wait 200 usec
        do { __asm__ __volatile__ ("pause" : : : "memory"); }while(*((volatile uint32_t*)(lapic_ptr + 0x300)) & (1 << 12));     // wait for delivery
    }
}
// release the AP spinlocks
bspdone = 1;
// now you'll have the number of running APs in 'aprunning'
</syntaxhighlight>
 
=== AP Initialization Code ===
As the application processors start up in real mode, a little Assembly is needed to enter protected mode. Modify this example to your kernel's needs.
<syntaxhighlight lang="asm">
# this code will be relocated to 0x8000; it sets up the environment for calling a C function
    .code16
ap_trampoline:
    cli
    cld
    ljmp    $0, $0x8040
    .align 16
_L8010_GDT_table:
    .long 0, 0
    .long 0x0000FFFF, 0x00CF9A00    # flat code
    .long 0x0000FFFF, 0x00CF9200    # flat data (32-bit, so that %ss uses the full %esp)
    .long 0x00000068, 0x00CF8900    # tss
_L8030_GDT_value:
    .word _L8030_GDT_value - _L8010_GDT_table - 1
    .long 0x8010
    .long 0, 0
    .align 64
_L8040:
    xorw    %ax, %ax
    movw    %ax, %ds
    lgdtl   0x8030
    movl    %cr0, %eax
    orl     $1, %eax
    movl    %eax, %cr0
    ljmp    $8, $0x8060
    .align 32
    .code32
_L8060:
    movw    $16, %ax
    movw    %ax, %ds
    movw    %ax, %ss
    # get our Local APIC ID
    mov     $1, %eax
    cpuid
    shrl    $24, %ebx
    movl    %ebx, %edi
    # set up a 32k stack for each core; it is important that every core has its own stack
    shll    $15, %ebx
    movl    stack_top, %esp
    subl    %ebx, %esp
    pushl   %edi
    # spinlock, wait for the BSP to finish
1:  pause
    cmpb    $0, bspdone
    jz      1b
    lock    incb aprunning
    # jump into C code (should never return)
    ljmp    $8, $ap_startup
</syntaxhighlight>
<syntaxhighlight lang="C">
// this C code can be anywhere you want it, no relocation needed
void ap_startup(int apicid) {
// do what you want to do on the AP
while(1);
}
</syntaxhighlight>
 
==See Also==
 
===External Links===
*[https://web.archive.org/web/20170410220205/https://download.intel.com/design/archives/processors/pro/docs/24201606.pdf Intel Multiprocessor Specification]
*[http://www.osdever.net/tutorials/view/multiprocessing-support-for-hobby-oses-explained Multiprocessing Support for Hobby OSes Explained at BonaFide]