MZ: Difference between revisions

From OSDev.wiki
{{File formats}}


The MS-DOS EXE format, also known as MZ after its signature (the initials of Microsoft engineer Mark Zbikowski), was introduced with PC-DOS 1.0; pre-release version 0.90 only supported the simple [[COM]] format. In DOS 1.x the DOS kernel does not support EXEs and the EXE loader is in COMMAND.COM; in DOS 2.x and later it was moved into the DOS kernel. The format is designed as a relocatable executable running under real mode. As such, only DOS and Windows 9x/Me can use this format natively, but several free DOS emulators (e.g., [http://www.dosbox.com/ DOSBox]) support it and run under various operating systems (e.g., Linux, Amiga, Windows NT). Although they can exist on their own, MZ executables are embedded in all [[NE]], [[LE]], and [[PE]] executables, usually as stubs so that when they are run under DOS, they display:

 This program cannot be run in MS-DOS mode.
However, the embedded MZ executable can also be a full program, so that a single file provides two ports of the same application (e.g. one for DOS and one for Windows). Windows 9x will run the MZ executable if the program is started from the command line prompt, or the PE executable if it is started normally. In the case of boot loaders, this can help provide a DOS version, especially since UEFI requires the PE format, which contains an MZ executable.

==MZ File Structure==

An MZ executable consists of only two structures: the header and the relocation table. The header, which is followed by the program image, looks like this:

{| {{wikitable}}
|-
! colspan=2|Offset
! Field
! Size
! Description
|-
| 0
| 0x00
| Signature
| word
| 0x5A4D (ASCII for 'M' and 'Z')
|-
| 2
| 0x02
| Extra bytes
| word
| Number of bytes used in the last page. A value of 0 means the last page is full (all 512 bytes are used).
|-
| 4
| 0x04
| Pages
| word
| Number of pages in the file, counting a partially filled last page as one page.
|-
| 6
| 0x06
| Relocation items
| word
| Number of entries in the relocation table.
|-
| 8
| 0x08
| Header size
| word
| The number of paragraphs taken up by the header. It can be any value, as the loader just uses it to find where the actual executable data starts. It may be larger than what the "standard" fields take up, and you may use it if you want to include your own header metadata, or put the relocation table there, or use it for any other purpose.
|-
| 10
| 0x0A
| Minimum allocation
| word
| The number of paragraphs '''required''' by the program, excluding the PSP and program image. If no free block is big enough, the loading stops.
|-
| 12
| 0x0C
| Maximum allocation
| word
| The number of paragraphs '''requested''' by the program. If no free block is big enough, the biggest one possible is allocated.
|-
| 14
| 0x0E
| Initial SS
| word
| Relocatable segment address for SS.
|-
| 16
| 0x10
| Initial SP
| word
| Initial value for SP.
|-
| 18
| 0x12
| Checksum
| word
| When added to the sum of all other words in the file, the result should be zero.
|-
| 20
| 0x14
| Initial IP
| word
| Initial value for IP.
|-
| 22
| 0x16
| Initial CS
| word
| Relocatable segment address for CS.
|-
| 24
| 0x18
| Relocation table
| word
| The offset of the relocation table, in bytes from the start of the file.
|-
| 26
| 0x1A
| Overlay
| word
| Value used for overlay management. If zero, this is the main executable.
|-
| 28
| 0x1C
| Overlay information
| N/A
| Files sometimes contain extra information here for the main program's overlay management.
|}
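
As a rough illustration of the table above, the fourteen standard header words can be unpacked in one call. This is only a sketch: the names <code>MZHeader</code> and <code>parse_mz_header</code> and the field names are made up for this example and do not come from any official header file.

```python
import struct
from collections import namedtuple

# Hypothetical field names that mirror the table above, in file order.
MZHeader = namedtuple("MZHeader", [
    "signature", "extra_bytes", "pages", "relocation_items",
    "header_paragraphs", "min_alloc", "max_alloc", "initial_ss",
    "initial_sp", "checksum", "initial_ip", "initial_cs",
    "relocation_table_offset", "overlay",
])

# Fourteen little-endian 16-bit words (28 bytes, offsets 0x00 to 0x1B).
MZ_HEADER_FORMAT = "<14H"


def parse_mz_header(data: bytes) -> MZHeader:
    """Unpack the standard MZ header fields from the start of a file."""
    if len(data) < struct.calcsize(MZ_HEADER_FORMAT):
        raise ValueError("file too short for an MZ header")
    header = MZHeader(*struct.unpack_from(MZ_HEADER_FORMAT, data, 0))
    if header.signature != 0x5A4D:  # bytes 'M', 'Z' read as a little-endian word
        raise ValueError("missing MZ signature")
    return header
```

A caller would typically read the first 28 bytes of a file and pass them in; anything beyond offset 0x1C is left untouched by this sketch.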

A paragraph is 16 bytes in size. A page (or block) is 512 bytes long.
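
The page and paragraph units combine to give the size of the program image. The sketch below assumes the common convention that an "extra bytes" value of 0 means the last page is completely used; the function names are hypothetical.

```python
PARAGRAPH_SIZE = 16   # bytes per paragraph
PAGE_SIZE = 512       # bytes per page (block)


def file_image_size(pages: int, extra_bytes: int) -> int:
    """Bytes of the file covered by the header plus the program image."""
    if pages == 0:
        return 0
    if extra_bytes == 0:
        # Assumption: 0 means the last page is full.
        return pages * PAGE_SIZE
    # All pages but the last are full; the last holds extra_bytes bytes.
    return (pages - 1) * PAGE_SIZE + extra_bytes


def program_image_size(pages: int, extra_bytes: int, header_paragraphs: int) -> int:
    """Bytes of program image that follow the header."""
    return file_image_size(pages, extra_bytes) - header_paragraphs * PARAGRAPH_SIZE
```

For instance, a file with 3 pages, 0x90 extra bytes and a 4-paragraph header covers 2*512 + 0x90 = 1168 bytes, of which 1168 - 64 = 1104 bytes are program image.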

If both the minimum and maximum allocation fields are cleared, MS-DOS will attempt to load the executable as high as possible in memory. Otherwise, the image will be loaded just above the 256-byte PSP structure, in low memory.

===Relocations===

After loading the executable into memory, the program loader goes through every entry in the relocation table. For each entry, the loader adds the start segment address to the 16-bit word pointed to by the segment:offset pair. For example, a relocation entry 0001:001A makes the loader add the start segment address to the word at offset 1*0x10+0x1A=0x2A within the program image.

Each entry in the relocation table has the following layout:

{| {{wikitable}}
|-
! colspan=2|Offset
! Field
! Size
! Description
|-
| 0
| 0x00
| Offset
| word
| Offset of the relocation within provided segment.
|-
| 2
| 0x02
| Segment
| word
| Segment of the relocation, relative to the load segment address.
|}

The initial CS and SS values from the header are relocated in a similar fashion: the start segment address is added to each of them.
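
The relocation pass described in this section can be sketched as follows. It assumes the program image (without the header) has already been copied into a <code>bytearray</code>; the function names are hypothetical and this is not taken from any particular loader.

```python
import struct


def read_relocation_table(data: bytes, table_offset: int, count: int):
    """Return a list of (offset, segment) pairs read from the file."""
    # Each entry is two little-endian words: offset first, then segment.
    return [struct.unpack_from("<HH", data, table_offset + 4 * i)
            for i in range(count)]


def apply_relocations(image: bytearray, relocations, start_segment: int) -> None:
    """Add start_segment to every word named by the relocation entries."""
    for offset, segment in relocations:
        position = segment * 16 + offset          # byte offset inside the image
        (value,) = struct.unpack_from("<H", image, position)
        # Segment arithmetic is 16-bit, so the sum wraps at 0x10000.
        struct.pack_into("<H", image, position, (value + start_segment) & 0xFFFF)
```

With the entry 0001:001A from the example above and a start segment of 0x1000, the word at image offset 0x2A is increased by 0x1000.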

==Initial Program State==
* ES and DS both point to the segment containing the PSP structure.
* CS equals the value specified in the header, relocated by adding the start segment address to it.
* IP equals the value specified in the header. Note that, unlike [[COM]] executables, MZ programs do not start at a fixed offset of 0x100.
* SS equals the value specified in the header, relocated in the same way as CS.
* SP equals the value specified in the header.
* AL is 0x00 if the first FCB in the PSP has a valid drive identifier, 0xFF otherwise.
* AH is the same as AL, but for the second FCB in the PSP.
* All other registers may or may not be set to 0. You should consider them undefined.
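
The segment and pointer registers in the list above can be derived from the header with a little arithmetic. The sketch below assumes the usual low-memory case in which the image is loaded directly above the 256-byte PSP, so the start segment is the PSP segment plus 0x10 paragraphs; the function name is hypothetical, and AL/AH are omitted because they depend on the FCBs.

```python
def initial_registers(psp_segment: int, initial_cs: int, initial_ip: int,
                      initial_ss: int, initial_sp: int) -> dict:
    """Compute the initial segment/pointer registers for an MZ program."""
    # Assumption: image sits right above the PSP (256 bytes = 0x10 paragraphs).
    start_segment = (psp_segment + 0x10) & 0xFFFF
    return {
        "DS": psp_segment,
        "ES": psp_segment,
        "CS": (initial_cs + start_segment) & 0xFFFF,  # relocated
        "IP": initial_ip,                             # used as-is
        "SS": (initial_ss + start_segment) & 0xFFFF,  # relocated
        "SP": initial_sp,                             # used as-is
    }
```

For example, with the PSP at segment 0x1000 and header values CS=0x0002, SS=0x0010, the program starts with CS=0x1012 and SS=0x1020.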

==PE Extension==

The MZ header has been extended with the items below, as defined in WinNT.h. Although this extension is often associated with the [[PE]] executable, it was originally added for the New Executable ([[NE]]) format, which was first released as part of Windows 1.0 in 1985 and the multitasking MS-DOS 4.0 in 1986. The "PE header start" field is called "e_lfanew" in Microsoft's headers, since it was originally the pointer to the NE header; it was later also used for LE, LX and PE, and in principle it can be used for any executable format that is defined as an extension of MZ.

{| {{wikitable}}
|-
! colspan=2|Offset
! Field
! Size
! Description
|-
| 28
| 0x1C
| Reserved
| qword
|
|-
| 36
| 0x24
| OEM identifier
| word
| Defined by name but no other information is given; typically zeroes
|-
| 38
| 0x26
| OEM info
| word
| Defined by name but no other information is given; typically zeroes
|-
| 40
| 0x28
| Reserved
| 20 bytes
|
|-
| 60
| 0x3C
| PE header start
| dword
| Offset of the PE header from the start of the file. In NE, LE and LX executables the same field points to the corresponding header.
|}
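
A tool that wants to know which extended format follows the MZ stub can read the dword at 0x3C and inspect the signature it points to. The following is a minimal sketch with a hypothetical function name; note that a plain MZ program may hold arbitrary data at 0x3C, so a match is only a strong hint.

```python
import struct

# Signatures of formats defined as extensions of MZ.
KNOWN_SIGNATURES = {b"NE": "NE", b"LE": "LE", b"LX": "LX", b"PE\0\0": "PE"}


def find_extended_header(data: bytes):
    """Return (format_name, file_offset) or None for a plain MZ file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    (offset,) = struct.unpack_from("<I", data, 0x3C)  # the e_lfanew field
    for signature, name in KNOWN_SIGNATURES.items():
        # An out-of-range offset yields an empty slice and never matches.
        if data[offset:offset + len(signature)] == signature:
            return name, offset
    return None
```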


==See Also==
* [http://web.archive.org/web/20120204110033/http://www.nondot.org/sabre/os/files/Executables/EXE.txt OSRC]
* [http://www.delorie.com/djgpp/doc/exe/ D.J. Delorie: MZ Header format]
* [http://www.pinvoke.net/default.aspx/Structures.IMAGE_DOS_HEADER IMAGE_DOS_HEADER]
* [https://marcin-chwedczuk.github.io/a-closer-look-at-portable-executable-msdos-stub A closer look at Portable Executable MS-DOS Stub]
* [https://github.com/FDOS/kernel/blob/master/kernel/task.c#L601 FreeDOS kernel's MZ loader source code]


[[Category:Executable Formats]]

Latest revision as of 16:44, 28 February 2024

