Segmentation: Difference between revisions

In [[Real Mode]] you use a logical address in the form A:B to address memory. This is translated into a physical address using the equation:
 
Physical address = (A * 0x10) + B
 
The registers in pure real-mode are limited to 16 bits for addressing. A 16-bit value can represent any integer from 0 to 65535 (64 KiB minus 1). This means that if we set A to a fixed value and allow B to change, we can address a 64 KiB area of memory. This 64 KiB area is called a segment.
 
* A = a 64 KiB segment
* B = offset within the segment
 
The base address of a segment is the (A * 0x10) portion of the equation above. Because consecutive values of A produce base addresses only 16 bytes apart while each segment spans 64 KiB, segments can overlap.
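As a rough illustration of the equation above, the translation can be sketched in C. The function name <code>real_mode_phys</code> is made up for this example, and the sketch does not model address wraparound on CPUs with a 20-bit address bus.

```c
#include <stdint.h>

/* Translate a real-mode logical address A:B into a physical address.
 * Hypothetical helper for illustration: physical = (A * 0x10) + B. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    /* Widen to 32 bits first so the multiplication cannot overflow 16 bits. */
    return ((uint32_t)segment * 0x10u) + offset;
}
```

Two different logical addresses can name the same byte, which is the overlap described above: under this sketch 0x07C0:0x0000 and 0x0000:0x7C00 both give 0x7C00.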
The x86 line of processors has 6 segment registers (CS, DS, ES, FS, GS, SS). They are totally independent of one another.
 
{| {{wikitable}}
|-
! CS
| Code Segment
|-
! DS
| Data Segment
|-
! SS
| [[Stack]] Segment
|-
! ES
| Extra Segment
|-
! FS
| rowspan=2 | General Purpose Segments
|-
! GS
|}
 
DS, ES, FS, GS, SS are used to form addresses when you want to read/write to memory. They don't always have to be explicitly encoded, because some processor operations assume that certain segment registers will be used.
 
E.g.
 
==Protected Mode==
:''Segmentation is considered an obsolete memory protection technique in protected mode by both CPU manufacturers and most programmers. It is no longer supported in long mode. The information here is required to get protected mode working; a 64-bit GDT is also needed to enter long mode, and segments are still used to jump from long mode to compatibility mode and back. If you want to be serious about OS development, we strongly recommend using a flat memory model and [[Paging|paging]] as the memory management technique. For more information, consult [[x86-64]].''
 
:''Read more about [[Global Descriptor Table]]''
 
In [[Protected mode]] you use a logical address in the form A:B to address memory. As in [[Real Mode]], A is the segment part and B is the offset within that segment. The registers in protected mode are limited to 32 bits. A 32-bit value can represent any integer from 0 to 0xFFFFFFFF (4 GiB minus 1).

Because B can be any value in that range, our segments now have a maximum size of 4 GiB (same reasoning as in real mode).
 
Now for the difference.
* The segment presence (Is it present or not)
* The descriptor type (0 = system; 1 = code/data)
* The segment type (Code/Data/Read/Write/Accessed/Conforming/Non-Conforming/Expand-Up/[[Expand_Down|Expand-Down]])
 
For the purposes of this explanation I'm only interested in 3 things: the base address, the limit and the descriptor type.
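To make these three fields concrete, here is a minimal sketch in C of how a segment described by a base, a limit and a descriptor type could turn an offset B into a linear address. It is a simplified model, not the hardware descriptor layout: the names <code>segment_model</code> and <code>translate</code> are hypothetical, the limit is assumed to be already expressed in bytes as the highest valid offset, and granularity, expand-down segments and privilege checks are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of the three fields discussed above. */
struct segment_model {
    uint32_t base;         /* linear address where the segment starts */
    uint32_t limit;        /* highest valid offset (assumed byte-granular) */
    bool     code_or_data; /* descriptor type: true = code/data, false = system */
};

/* Translate offset B within the segment into a linear address.
 * Returns false where this model would reject the access. */
bool translate(const struct segment_model *seg, uint32_t offset, uint32_t *linear)
{
    if (!seg->code_or_data || offset > seg->limit)
        return false;             /* not a code/data segment, or past the limit */
    *linear = seg->base + offset; /* linear address = base + offset */
    return true;
}
```

With a base of 0x100000 and a limit of 0xFFFF, offset 0x10 maps to 0x100010 in this model, while offset 0x10000 is rejected because it lies beyond the limit.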
==Notes Regarding C==
*Most C compilers assume a flat-memory model.
*In this model all the segments cover the full address space (usually 0->4 GiB on x86). In essence this means that we completely ignore the A part of our A:B logical address. The reason for this is that most processors don't actually have segmentation (and it's much easier for the compiler to optimize).
*This leaves you with 2 descriptors per privilege level (usually Ring 0 and Ring 3), one for code and one for data, which both describe precisely the same segment. The only difference is that the code descriptor is loaded into CS, and the data descriptor is used by all the other segment registers. You need both a code and a data descriptor because the processor will not allow you to load CS with a data descriptor (this helps with security when using a segmented memory model, and although useless in the flat-memory model it is still required because you can't turn off segmentation).
*In general if you want to use the segmentation mechanism, by having the different segment registers represent segments with different base addresses, you won't be able to use a modern C compiler, and may very well be restricted to just Assembly.
*So, if you're going to use C, do what the rest of the C world does, which is set up a flat-memory model, use paging, and ignore the fact that segmentation even exists.
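The flat-memory setup described above can be sketched in C as a small table of descriptors. This is only an illustration under stated assumptions: the helper names <code>gdt_encode</code> and <code>build_flat_gdt</code> are invented for this example, the entry order (null, Ring 0 code, Ring 0 data, Ring 3 code, Ring 3 data) is a common convention rather than a requirement, and the sketch only builds the entries in memory without loading them into the CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 8-byte segment descriptor (hypothetical helper).
 * base:   32-bit base address
 * limit:  20-bit limit
 * access: access byte (present, DPL, descriptor type, segment type)
 * flags:  upper flag nibble (granularity, size) */
uint64_t gdt_encode(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);                     /* limit field is 20 bits wide */
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base bits 0..23   */
    d |= (uint64_t)access << 40;                   /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit bits 16..19 */
    d |= (uint64_t)(flags & 0xFu) << 52;           /* flags nibble      */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;   /* base bits 24..31  */
    return d;
}

/* Fill a flat-model table: every segment has base 0 and covers 4 GiB
 * (limit 0xFFFFF with 4 KiB granularity, 32-bit size flag). */
void build_flat_gdt(uint64_t gdt[5])
{
    gdt[0] = 0;                                    /* mandatory null descriptor */
    gdt[1] = gdt_encode(0, 0xFFFFFu, 0x9A, 0xC);   /* Ring 0 code */
    gdt[2] = gdt_encode(0, 0xFFFFFu, 0x92, 0xC);   /* Ring 0 data */
    gdt[3] = gdt_encode(0, 0xFFFFFu, 0xFA, 0xC);   /* Ring 3 code */
    gdt[4] = gdt_encode(0, 0xFFFFFu, 0xF2, 0xC);   /* Ring 3 data */
}
```

Within each privilege level the code and data entries differ only in the access byte, which matches the point above that both describe precisely the same segment.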
 
==Notes Regarding Pascal[FPC]==
 
The above may apply in theory to FreePascal; in practice it is ignored, if the compiler pays any attention to it at all.
The twin segments for CODE and DATA are used and are therefore required, as specified above. Size limits, however, are respected (a segment does NOT have to be 4 GB in length).

"In general if you want to use the segmentation mechanism, by having the different segment registers represent segments with different base addresses, you won't be able to use a modern C compiler, and may very well be restricted to just Assembly."

This is simply NOT true for FreePascal.

The 'A' in A:B is what allows 48- and 64-bit pointer references, not only with Pascal's NewFrontier unit but with FreePascal as well (Word:Longint pointer reference).

*Assuming that CODE and DATA occupy the same space (at least when the PAE NX bit and the paging unit are not used) is what allows rogue or virus-like code to take advantage of the machine in the first place. The Intel specifications even say this: CODE and DATA must be kept separate. Microsoft is still plagued by this problem, despite having NX bits enabled even in its latest operating systems.
 
==See Also==
=== Articles ===
[[Segment Limits#Segmentation|Segment Limits]]
 
=== Threads ===
 
===External Links===
*[http://mirror.href.com/thestarman/asm/debug/Segments.html Removing the Mystery from SEGMENT : OFFSET Addressing]
*[http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation Aug 2008: Memory Translation and Segmentation] by Gustavo Duarte
 
[[Category:X86]]
[[Category:Memory management]]
[[Category:Memory Segmentation]]