File Allocation Table (FAT) was introduced with DOS v1.0 (and possibly CP/M), supposedly written by Bill Gates. FAT is a very simple filesystem, little more than a singly linked list of clusters. FAT filesystems use very little memory, and FAT is one of the most basic filesystems, if not the most basic, still in use today.

About FAT

There are two versions of this simplified FAT: FAT12 and FAT16. FAT12 was designed for floppy disks and can manage a maximum size of 16 MB using 12-bit cluster numbers. FAT16 was designed for early hard disks and can address at most 64K (65,536) clusters, for a maximum size of 64K * cluster_size. The larger the hard disk, the larger the cluster size had to be, which led to large amounts of "slack space" on the disk.
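The size limits and the slack-space effect above come down to simple arithmetic. The following is a rough sketch only: the function names are made up for illustration, and the calculation ignores the handful of cluster values that the FAT reserves for special markers.

```python
def max_volume_bytes(cluster_bits: int, cluster_size: int) -> int:
    """Upper bound on volume size: number of cluster numbers times cluster size."""
    return (1 << cluster_bits) * cluster_size


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes wasted in the last, partly filled cluster of a file."""
    return (-file_size) % cluster_size


# FAT12 with 4 KB clusters reaches the 16 MB figure quoted above.
fat12_limit = max_volume_bytes(12, 4096)

# The same small file wastes far more space on a large-cluster FAT16 volume.
small_cluster_slack = slack_bytes(3099, 512)
large_cluster_slack = slack_bytes(3099, 32768)
```

With 512-byte clusters a 3099-byte file wastes 485 bytes; with 32 KB clusters the same file wastes 29,669 bytes.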

FAT12 and FAT16 filesystems have fixed-size "8.3" filenames and limited support for file attributes. See also the FAT12 document and the FAT tutorial reported by Kemp.

Note that the FAT filesystem is covered by software patents.

VFAT

VFAT is an extension of FAT12 and FAT16 that adds support for long filenames (up to 255 characters). First introduced with Windows 95, it uses a kludge whereby the long filename is stored in extra directory entries marked with the "volume label" attribute, placed sequentially next to the regular 8.3 entry for the file. (This is a bit of an oversimplification, but close enough.)

FAT32

FAT32 was introduced with Windows95-B and Windows98. FAT32 solved some of FAT's problems. No more limit of 64K clusters! FAT32, as its name suggests, uses 32-bit cluster numbers and can nominally handle a maximum of 4 gibiclusters per partition. This lets very large hard disks keep very small cluster sizes and thus reduces slack space.

N.B. FAT32 is actually only FAT28. The top 4 bits are, according to Microsoft's specification, "Undefined". This does not mean that the top four bits should be "0". When a FAT entry is altered the top 4 bits should be left unchanged.
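The rule about leaving the top four bits alone can be sketched as a read-modify-write on the 32-bit entry. This is a minimal illustration; `update_fat32_entry` and `FAT32_VALUE_MASK` are names invented here, not part of any specification.

```python
FAT32_VALUE_MASK = 0x0FFFFFFF  # the low 28 bits carry the cluster value


def update_fat32_entry(old_entry: int, new_value: int) -> int:
    """Replace the low 28 bits of an entry and keep its top 4 bits unchanged."""
    preserved = old_entry & 0xF0000000
    return preserved | (new_value & FAT32_VALUE_MASK)


# An entry whose top nibble happens to be 0xA keeps that nibble after the update.
entry = update_fat32_entry(0xA0000005, 0x0FFFFFFF)
```

When reading an entry, the same mask applies in the other direction: only `entry & FAT32_VALUE_MASK` is the cluster value.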

Implementation Details

The File Allocation Table (FAT) is a table stored on every hard or floppy disk that indicates the status and location of all data clusters that are on the disk. The File Allocation Table can be considered to be the "table of contents" of a disk. If the file allocation table is damaged or lost, then a disk is unreadable. In a file server the FAT data is sometimes kept in the computer RAM for quick access and is easily lost if the system crashes as the result of a power failure.

The File Allocation Table is maintained by the operating system and provides a map of the clusters (the basic unit of logical storage on a disk) in which a file is stored. When you write a new file, the file is stored in one or more clusters that are not necessarily next to each other; they may be rather widely scattered over the disk. A typical cluster size is 2,048, 4,096, or 8,192 Bytes. The operating system creates FAT entries for the new file that record where each cluster is located and in what order. When you read a file, the operating system reassembles the file from its clusters and delivers it as a whole to wherever you want to read it.

The hard disk is physically arranged by cylinders, heads, and sectors. That is how it is addressed by the hardware controller and the ROM BIOS, which work at the physical level. For the operating system and other programs, however, this is cumbersome, since the physical number of cylinders, heads, and sectors varies from disk to disk. It would be more convenient to view the disk as simply a large continuous block of sectors with simple sequential addresses.

MS-DOS does, in fact, view the sectors on a disk as a one-dimensional array of sectors numbered from 0 to n-1, where n is the total number of sectors on the disk. It therefore must translate from logical sector numbers to physical cylinder-head-sector (CHS) addresses. In doing so, MS-DOS sequentially numbers all the sectors of head 0, cylinder 0, then all the sectors of head 1, cylinder 0, and so on for each head, and then repeats this for each cylinder, to the end of the disk.
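The numbering order described above implies a straightforward conversion from a logical sector number to a CHS address. The sketch below assumes the geometry (heads and sectors per track) is already known, for example from the boot sector; `lba_to_chs` is an illustrative name.

```python
def lba_to_chs(lsn: int, heads: int, sectors_per_track: int) -> tuple:
    """Convert a logical sector number to (cylinder, head, sector).

    Sectors are numbered from 1 within a track; cylinders and heads from 0.
    """
    cylinder, rest = divmod(lsn, heads * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


# A 1.44MB floppy has 2 heads and 18 sectors per track.
first = lba_to_chs(0, 2, 18)     # (0, 0, 1)
last = lba_to_chs(2879, 2, 18)   # (79, 1, 18)
```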

Furthermore, MS-DOS logically divides this array of sectors into five distinct areas, which are, in the order they appear on the disk,

  • The partition table,
  • The boot record,
  • The File Allocation Table (FAT),
  • The root directory, and
  • The data area.

The first four areas of the disk, collectively called the system area, are used by MS-DOS to keep track of the contents of the disk. The largest area of the disk, the data area, is where all user files and data reside. For the data area MS-DOS uses a special numbering scheme, called cluster numbering, which is used in addition to, but is independent of, logical sector numbers.

The boot record occupies one sector, and is always placed in logical sector number (LSN) 0, which is physically cylinder 0, head 0, sector 1, the first sector of the first head of the first cylinder on the disk. This is the easiest sector on the disk for the computer to locate when it begins running.

The File Allocation Table (FAT) is an array of integers in which each element represents one cluster in the data area. For each cluster in the data area the corresponding entry in the FAT contains a code which indicates the status of the cluster. The cluster may be available for use, it may be reserved by the operating system, it may be unavailable due to a bad sector on the disk, or it may be in use by a file.

MS-DOS maintains a hierarchical directory structure in which there is one entry for every file on the disk.

The data area is where all user files and data reside.

FAT 12 (Diskette)

Boot Sector

Contents

BYTE       ADDR  +0 +1 +2 +3 +4 +5 +6 +7  meaning
0 - 2      0000  eb 3c 90 .. .. .. .. ..  Jump to start of boot code
3 - 10     0000  .. .. .. 6d 6b 64 6f 73  OEM identifier (mkdosfs)
           0008  66 73 00 .. .. .. .. ..  (program/OS being used to format)
11 - 12    0008  .. .. .. 00 02 .. .. ..  Number of Bytes per sector (512)
13         0008  .. .. .. .. .. 01 .. ..  Number of sectors per allocation unit (cluster)
14 - 15    0008  .. .. .. .. .. .. 01 00  Number of reserved sectors
16         0010  02 .. .. .. .. .. .. ..  Number of FATs on the diskette
17 - 18    0010  .. e0 00 .. .. .. .. ..  Number of directory entries (must be set so that the root directory occupies entire sectors)
19 - 20    0010  .. .. .. 40 0b .. .. ..  Total sectors in the logical volume
21         0010  .. .. .. .. .. f0 .. ..  Media descriptor type
22 - 23    0010  .. .. .. .. .. .. 09 00  Number of sectors per FAT
24 - 25    0018  12 00 .. .. .. .. .. ..  Number of sectors per track
26 - 27    0018  .. .. 02 00 .. .. .. ..  Number of heads or sides on the diskette
28 - 31    0018  .. .. .. .. 00 00 00 00  Number of hidden sectors
32 - 35    0020  00 00 00 00 .. .. .. ..  Large sector count (used when Bytes 19 - 20 are zero)
36         0020  .. .. .. .. 00 .. .. ..  Drive number
37         0020  .. .. .. .. .. 00 .. ..  Flags
38         0020  .. .. .. .. .. .. 29 ..  Signature (must be 0x28 or 0x29)
39 - 42    <disk-dependent>               VolumeID 'Serial' number (ignore this)
43 - 53    0028  .. .. .. 20 20 20 20 20  Volume label,
           0030  20 20 20 20 20 20 .. ..  padded with spaces
54 - 61    0030  .. .. .. .. .. .. F  A   System identifier
           0038  T  1  2  20 20 20 .. ..  (padded with spaces)
62 - 509   0038  .. .. .. .. .. .. 0e 1f  Start of bootstrap routine
           0040  be 5b 7c ac 22 c0 74 0b  Bootstrap routine (cont'd)
510 - 511  01f8  .. .. .. .. .. .. 55 aa  BIOS boot signature (dw 0xAA55)

Reading the Boot Sector

Bytes (0-2)

The first three Bytes, EB 3C and 90, disassemble to JMP SHORT 3C followed by NOP. The purpose is to jump over the disk format information. Since the first sector of the disk is loaded into RAM at location 0x0000:0x7c00 and executed, without this jump the processor would attempt to execute data that isn't code.

Bytes (3 - 10)

Bytes 3 - 10 hold the OEM identifier, which names the program or operating system that formatted the disk. In this example the eight Bytes 6D 6B 64 6F 73 66 73 and 00 spell "mkdosfs" followed by a zero Byte. The official FAT Specification from Microsoft says that this field is really meaningless and is ignored by MS FAT drivers; however, it does recommend the value "MSWIN4.1", as some third-party drivers supposedly check it and expect it to have that value. Older versions of DOS also report MSDOS5.1, and a Linux-formatted floppy will likely carry "mkdosfs" here. If the string is shorter than 8 bytes, as it is here, it is padded with zeroes.

Bytes (11 - 12)

The next two Bytes (11 - 12), 00 and 02, give the number of Bytes per sector. Multi-byte values are stored least significant Byte first, so the first thing you do when reading a pair of Bytes is reverse them to read 02 00. 0200 (hex) is 512 Bytes per sector in decimal.

Byte 13

Byte 13 is the number of sectors per allocation (cluster). In this case it is one.

Bytes (14 - 15)

These two Bytes, 01 and 00, indicate the number of reserved sectors. Again you must reverse the Bytes to 00 01. There is one reserved sector.

Byte 16

This is the first Byte of the row at address 0010, and it indicates the number of FATs on the diskette. There are two.

Bytes (17 -18)

This indicates the number of directory entries. Reversing the Bytes E0 00 to 00 E0 and converting the number to decimal we have 224 directory entries.

Bytes (19 - 20)

The total sectors in the logical volume. Reversing 40 and 0B to 0B 40 and converting the number to decimal, we have 2880 sectors in the logical volume. If that value is 0, it means there are more than 65535 sectors in the volume, and the actual count is stored in the large sector count (Bytes 32 - 35).

Byte 21

This Byte (F0) indicates the media descriptor type, which is here a 1.44MB floppy.

Bytes (22 - 23)

These two Bytes, 09 and 00, indicate the number of sectors per FAT. Reversing 09 and 00 to 00 09, we see that we have nine sectors per FAT.

Bytes (24 - 25)

Number of sectors per track. Reversing the Bytes 12 and 00 to 00 12, there are eighteen sectors per track.

Bytes (26 - 27)

These two Bytes indicate the number of heads or sides on the diskette. Reversing the Bytes 02 and 00 to 00 02, we see that there are two sides to the diskette.

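Bytes (28 - 31)

Number of hidden sectors. All four Bytes read zero, so there are no hidden sectors.

Bytes (62 - 509)

The bootstrap routine starts at Byte 62 (3E hex). This is where the jump in Bytes 0 - 2 lands: the two-Byte JMP SHORT instruction ends at offset 2, and 2 + 3C (hex) = 3E (hex).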
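The field-by-field reading above can be done in one step with Python's `struct` module. This is a sketch, not a complete driver: the field names are chosen here for illustration, only the first 36 Bytes are decoded, and `SAMPLE` is simply the example Bytes from the Contents table.

```python
import struct

# Little-endian layout of Bytes 0 - 35 of the boot sector, with no padding.
BPB_FORMAT = "<3s8sHBHBHHBHHHII"
BPB_FIELDS = (
    "jump", "oem", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "num_fats", "root_entries", "total_sectors_16",
    "media", "sectors_per_fat", "sectors_per_track", "heads",
    "hidden_sectors", "total_sectors_32",
)

# The example values from the table, in on-disk Byte order.
SAMPLE = bytes.fromhex(
    "eb3c90" "6d6b646f73667300" "0002" "01" "0100" "02" "e000"
    "400b" "f0" "0900" "1200" "0200" "00000000" "00000000"
)


def parse_bpb(sector: bytes) -> dict:
    """Decode the start of a FAT12/FAT16 boot sector into a dict."""
    bpb = dict(zip(BPB_FIELDS, struct.unpack_from(BPB_FORMAT, sector, 0)))
    bpb["oem"] = bpb["oem"].rstrip(b"\x00 ").decode("ascii")
    # A zero 16-bit count means the real count is in the 32-bit field.
    bpb["total_sectors"] = bpb["total_sectors_16"] or bpb["total_sectors_32"]
    return bpb


bpb = parse_bpb(SAMPLE)
```

For the example diskette, `bpb["bytes_per_sector"]` is 512 and `bpb["total_sectors"]` is 2880. The `<` prefix in the format string takes care of the Byte reversal described above.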

Directory

Contents

Bytes     Meaning
0 - 10    File name with extension
11        Attributes of the file
12 - 21   Reserved Bytes
22 - 23   Time
24 - 25   Date
26 - 27   Entry cluster value (first cluster of the file)
28 - 31   Size of the file

Reading the Directory

A forum thread showed that reading the root directory might not be that simple. It seems that we should read all entries, skipping any entries marked as 'volume label'. -- PypeClicker

Bytes (0 - 10)

Starting with the Byte 2620, the first 11 Bytes (0 - 10) are the name of the file with extension. If the 11-byte string is PROCESSATXT, then the 8.3 filename is PROCESSA.TXT, since the first 8 bytes of the string comprise the filename and the last 3 are the extension. If the filename is shorter than 8 bytes or the extension is shorter than 3, padding spaces are added, e.g. a file name of LOADER.RC would be encoded simply as "LOADER  RC " (that's two spaces after LOADER and one after RC).
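Turning the 11-byte field back into a readable name is a matter of splitting at byte 8 and stripping the padding. A minimal sketch, assuming plain ASCII names; `decode_83_name` is an illustrative helper, not an established API.

```python
def decode_83_name(raw: bytes) -> str:
    """Convert an 11-byte directory name field to NAME.EXT form."""
    if len(raw) != 11:
        raise ValueError("an 8.3 name field is exactly 11 bytes")
    base = raw[:8].decode("ascii").rstrip(" ")
    ext = raw[8:].decode("ascii").rstrip(" ")
    # Files without an extension get no trailing dot.
    return f"{base}.{ext}" if ext else base


full = decode_83_name(b"PROCESSATXT")    # "PROCESSA.TXT"
padded = decode_83_name(b"LOADER  RC ")  # "LOADER.RC"
```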

Byte 11

This Byte lists the attributes of the file. To read it you must convert the hexadecimal Byte to binary. In this case 20 (hex) is converted to 0010 0000. Each of the eight bits represents an attribute of the file. When a bit is on, indicated by a one, the file has that attribute. Starting with the rightmost bit, which is bit 0, and working over to the leftmost bit, bit 7, the attributes are: read only, hidden, system file, volume label, sub-directory, and archive; the last two bits, bits 6 and 7, are reserved. In this particular file it is bit 5 that is on, meaning that it is an archive file.
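The bit-by-bit reading above maps naturally onto a small lookup. The sketch below uses the attribute names from the paragraph; `decode_attributes` is a name chosen for this example.

```python
# Index in this tuple = bit number in the attribute Byte (bits 6 and 7 are reserved).
ATTRIBUTE_BITS = (
    "read only", "hidden", "system file",
    "volume label", "sub-directory", "archive",
)


def decode_attributes(attr: int) -> list:
    """Return the names of all attributes whose bit is set."""
    return [name for bit, name in enumerate(ATTRIBUTE_BITS) if attr & (1 << bit)]


flags = decode_attributes(0x20)  # ["archive"]
```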

Bytes (12 - 21)

These are the reserved Bytes.

Bytes (22 - 23)

These two Bytes, 4E and 7B, indicate the time the file was made. To retrieve the time, reverse the Bytes to 7B 4E and convert to binary: 0111 1011 0100 1110. The hour is read from the first five bits, the minutes from the next six bits, and the seconds from the last five bits, so the bits are grouped like this: 01111 011010 01110. Reading the groups, the hour is 15, the minutes are 26, and the seconds are 14. Important: the seconds value must be multiplied by 2 to get the true seconds reading. So the file was created at 15:26:28 in 24-hour time, or 3:26:28 PM.

Hour 5 bits
Minutes 6 bits
Seconds 5 bits
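The same decoding can be written with shifts and masks instead of reading the binary digits by hand. A small sketch, with `decode_fat_time` as an illustrative name:

```python
def decode_fat_time(raw: bytes) -> tuple:
    """Decode the two time Bytes of a directory entry to (hour, minute, second)."""
    value = int.from_bytes(raw, "little")  # reverses the Byte order
    hour = value >> 11                     # top 5 bits
    minute = (value >> 5) & 0x3F           # middle 6 bits
    second = (value & 0x1F) * 2            # low 5 bits count 2-second steps
    return hour, minute, second


stamp = decode_fat_time(bytes([0x4E, 0x7B]))  # (15, 26, 28)
```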

Bytes (24 - 25)

These two Bytes, 96 and 26, indicate the date the file was made. To retrieve the date, reverse the Bytes to 26 96 and convert to binary: 0010 0110 1001 0110. The year is read from the first seven bits, the month from the next four bits, and the day from the last five bits, so the bits are grouped like this: 0010011 0100 10110. Reading the groups, the year is 19, the month is 4, and the day is 22. The year value must be added to 1980 to get the correct year. So the file was made on April 22, 1999.

Year 7 bits
Month 4 bits
Day 5 bits
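As with the time field, the date can be unpacked with shifts and masks. A small sketch, with `decode_fat_date` as an illustrative name:

```python
def decode_fat_date(raw: bytes) -> tuple:
    """Decode the two date Bytes of a directory entry to (year, month, day)."""
    value = int.from_bytes(raw, "little")  # reverses the Byte order
    year = 1980 + (value >> 9)             # top 7 bits, counted from 1980
    month = (value >> 5) & 0x0F            # middle 4 bits
    day = value & 0x1F                     # low 5 bits
    return year, month, day


made = decode_fat_date(bytes([0x96, 0x26]))  # (1999, 4, 22)
```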

Bytes (26 -27)

These two Bytes, 02 and 00, indicate the entry cluster value, that is, the first cluster of the file, which is used both to index the FAT and to locate the file in the Open Space Area (the data area). More about this in the last two sections (File Allocation Table Entry Cluster Values and Location of File in the Open Space Area).

Bytes (28 - 31)

These four Bytes 1B, 0C, 00, 00 indicate the size of the file. Reversing the Bytes to 00 00 0C 1B and converting the number to decimal the size of the file is 3099 Bytes.
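Putting the fields of this section together, a whole 32-Byte directory entry can be unpacked in one call. This is a sketch under the layout given in the Contents table; the field names are invented for this example, and `SAMPLE_ENTRY` is assembled from the example values used above.

```python
import struct

# name+ext, attributes, reserved, time, date, first cluster, size (little-endian).
DIR_ENTRY_FORMAT = "<11sB10sHHHI"

SAMPLE_ENTRY = (
    b"PROCESSATXT" + bytes([0x20]) + bytes(10)
    + bytes.fromhex("4e7b" "9626" "0200" "1b0c0000")
)


def parse_dir_entry(raw: bytes) -> dict:
    """Split one 32-Byte directory entry into its raw fields."""
    name, attr, _reserved, time, date, cluster, size = struct.unpack(
        DIR_ENTRY_FORMAT, raw
    )
    return {
        "name": name,              # still in padded 11-byte form
        "attributes": attr,
        "time": time,              # packed hour/minute/second
        "date": date,              # packed year/month/day
        "first_cluster": cluster,
        "size": size,
    }


entry = parse_dir_entry(SAMPLE_ENTRY)
```

For the example file, `entry["first_cluster"]` is 2 and `entry["size"]` is 3099.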

Finding the Beginning of the Boot, FAT, Directory, and Open Space

Boot Sector

As stated in the introduction, the Boot Sector is always placed in logical sector number (LSN) 0, at Byte 0000.

File Allocation Table (FAT)

The File Allocation Table begins after the Boot Sector. To find the starting Byte, find the length of the Boot Sector which is one sector multiplied by the number of Bytes per sector (Bytes 11 and 12 of the Boot Sector). The File Allocation Table begins at 0200 (hex).

Directory

The Directory begins after both the Boot Sector and the File Allocation Tables. To find the starting Byte, take the number of File Allocation Tables on the diskette (Byte 16 of the Boot Sector) and multiply it by the number of sectors per FAT (Bytes 22 and 23 of the Boot Sector). Add the number of Boot Sectors (which is one) to get the total number of sectors occupied by the FATs and the Boot Sector. Multiply this total by the number of Bytes per sector (Bytes 11 and 12 of the Boot Sector) to get the starting Byte of the Directory.

(2 * 9) + 1 = 19 sectors; 19 sectors * 512 Bytes/ sector = 9728 Bytes (decimal) or 2600 Bytes (hex). The start of the Directory is 2600.

Open Space

The Open Space begins after the Directory. To find the beginning of the Open Space, find the size of the Directory in Bytes and add it to the starting Byte of the Directory (2600). To find the size of the Directory, multiply the number of directory entries (Bytes 17 and 18 of the Boot Sector) by the size of one directory entry, which is 32 Bytes (decimal).

224 directory entries * 32 Bytes/ directory entry = 7168 Bytes (decimal) or 1C00 (hex)

1C00 Bytes (hex) + 2600 Bytes (hex) = 4200 Bytes (hex). The start of the Open Space is 4200.
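The three starting offsets worked out in this section follow directly from the boot sector values. A minimal sketch; `fat12_layout` is an illustrative name, and the reserved sector count (Bytes 14 and 15) stands in for the single Boot Sector.

```python
DIR_ENTRY_SIZE = 32  # Bytes per directory entry


def fat12_layout(bytes_per_sector, reserved_sectors, num_fats,
                 sectors_per_fat, root_entries):
    """Return the Byte offsets of the FAT, the Directory, and the Open Space."""
    fat_start = reserved_sectors * bytes_per_sector
    dir_start = fat_start + num_fats * sectors_per_fat * bytes_per_sector
    data_start = dir_start + root_entries * DIR_ENTRY_SIZE
    return fat_start, dir_start, data_start


# Values from the example boot sector.
fat_start, dir_start, data_start = fat12_layout(512, 1, 2, 9, 224)
```

With the example values this gives 0x200, 0x2600, and 0x4200, matching the hand calculation.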

File Allocation Table Entry Cluster Values

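This section follows the chain of cluster numbers in the File Allocation Table for the example file, starting from the entry cluster value stored in its directory entry. Each FAT entry holds the number of the next cluster of the file, until an end-of-file marker is reached.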

Starting with the entry cluster value (Bytes 26 and 27 in the Directory), find the values (02 and 00) and reverse them to read 00 02 (hex). The result is **2 (hex or decimal).

Each FAT12 entry is 12 bits, or 1.5 Bytes, long, so the Byte offset of an entry within the FAT is its cluster number multiplied by 1.5. The value 2 is the same in hexadecimal and decimal, so no conversion is necessary; just remember that this next step is in decimal. Multiply 2 by 1.5, giving 3. Now go to the File Allocation Table and retrieve the 3rd (0203) and 4th (0204) Bytes, remembering to start your counting from zero. Take the two Bytes 03 and 40 and reverse them to 40 03. Because 3 is a whole number, AND the hexadecimal number 4003 (0100 0000 0000 0011) with the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **3 (hex or decimal).

Convert the result above to decimal (if necessary) and multiply by 1.5, giving 4.5. Dropping the fraction, we extract the 4th (0204) and 5th (0205) Bytes from the File Allocation Table, which are 40 and 00. Reverse them to read 00 40. Because 4.5 is not a whole number, we shift 0040 right by four bits to read 0004. The result is **4 (hex or decimal).

Multiply 4 (decimal) by 1.5 giving 6. Now we go to the 6th (0206) and 7th (0207) number in the File Allocation Table which are 05 and 60. Reverse the numbers to read 60 05. Because 6 is a whole integer, AND the binary value of the hexadecimal number of 6005 (0110 0000 0000 0101) to the binary value of the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **5 (hex or decimal).

Multiply 5 (decimal) by 1.5 giving 7.5. Now we read the 7th (0207) and 8th (0208) numbers in the File Allocation Table which are 60 and 00. Reversing the numbers we have 00 60. Because 7.5 is a fractional number we right shift 0060 to read 0006. The result is **6 (hex or decimal).

Multiply 6 (decimal) by 1.5, giving 9. Read the 9th (0209) and 10th (020A) Bytes in the File Allocation Table, which are 07 and 80. Reverse them to read 80 07. Because 9 is a whole number, AND the hexadecimal number 8007 (1000 0000 0000 0111) with the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **7 (hex or decimal).

Take the decimal result above and multiply by 1.5, giving 10.5. Read the 10th (020A) and 11th (020B) Bytes in the File Allocation Table, which are 80 and 00. Reversing them we have 00 80. Because 10.5 is not a whole number, we shift 0080 right by four bits to read 0008. The result is **8 (hex or decimal).

The decimal results marked with ** are also used to calculate the Location of File in Open Space Area (next section).

Multiply 8 (decimal) by 1.5, giving 12. Now extract the 12th (020C) and 13th (020D) Bytes in the File Allocation Table. These are FF and 0F. Reverse them to read 0F FF. This value, 0FFF (hex), indicates the end of the file.
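The multiply-by-1.5, then mask-or-shift procedure can be sketched as a short loop. The following assumes a FAT held in memory as Bytes. The first three Bytes of `SAMPLE_FAT` (F0 FF FF, the two reserved entries) are an assumption, since the walkthrough does not show them; the rest are the Bytes read above. Treating every value from 0xFF7 upward as a stop condition is also a simplification made for this sketch.

```python
# FAT Bytes 0 - 13; Bytes 0 - 2 are assumed, Bytes 3 - 13 come from the walkthrough.
SAMPLE_FAT = bytes.fromhex("f0ffff 034000 056000 078000 ff0f")


def fat12_entry(fat: bytes, cluster: int) -> int:
    """Read the 12-bit FAT entry for the given cluster number."""
    offset = cluster + cluster // 2              # cluster * 1.5, fraction dropped
    pair = fat[offset] | (fat[offset + 1] << 8)  # reverse the two Bytes
    # Odd clusters (fractional offset) use the high 12 bits, even ones the low 12.
    return pair >> 4 if cluster & 1 else pair & 0x0FFF


def cluster_chain(fat: bytes, first_cluster: int) -> list:
    """Follow the chain from first_cluster until an end or bad-cluster marker."""
    chain = []
    cluster = first_cluster
    max_entries = len(fat) * 2 // 3              # guards against looping chains
    while 2 <= cluster < 0xFF7 and len(chain) < max_entries:
        chain.append(cluster)
        cluster = fat12_entry(fat, cluster)
    return chain


chain = cluster_chain(SAMPLE_FAT, 2)  # [2, 3, 4, 5, 6, 7, 8]
```

Using integer arithmetic (`cluster + cluster // 2`) and testing whether the cluster number is odd avoids the fractional values of the hand calculation while giving the same result.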

Location of File in Open Space Area

To find the location of the file in the Open Space Area, take the decimal results of the File Allocation Table entry cluster values, as denoted by the double asterisks, and subtract 2, because cluster numbering in the Open Space Area starts at 2. Then multiply by the number of sectors per cluster (Byte 13 in the Boot Sector, here 1) and by the number of Bytes per sector, which is indicated in Bytes 11 and 12 in the Boot Sector. In this case the Bytes per sector value is 512 (decimal). Finally take this value in Bytes, convert it to hexadecimal, and add it to the starting location of the Open Space Area, which in this case is 4200.

(2 - 2) sectors * 512 Bytes per sector = 0 Bytes (decimal); 0 Bytes (hex) + 4200 Bytes = 4200. The first cluster is located at 4200.

(3 - 2) sectors * 512 Bytes per sector = 512 Bytes (decimal); 200 Bytes (hex) + 4200 Bytes = 4400. The second cluster is located at 4400.

(4 - 2) sectors * 512 Bytes per sector = 1024 Bytes (decimal); 400 Bytes (hex) + 4200 Bytes = 4600. The third cluster is located at 4600.

(5 - 2) sectors * 512 Bytes per sector = 1536 Bytes (decimal); 600 Bytes (hex) + 4200 Bytes = 4800. The fourth cluster is located at 4800.

(6 - 2) sectors * 512 Bytes per sector = 2048 Bytes (decimal); 800 Bytes (hex) + 4200 Bytes = 4A00. The fifth cluster is located at 4A00.

(7 - 2) sectors * 512 Bytes per sector = 2560 Bytes (decimal); A00 Bytes (hex) + 4200 Bytes = 4C00. The sixth cluster is located at 4C00.

(8 - 2) sectors * 512 Bytes per sector = 3072 Bytes (decimal); C00 Bytes (hex) + 4200 Bytes = 4E00. The seventh cluster is located at 4E00.
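The per-cluster arithmetic above reduces to one formula. In this sketch the default arguments are the values of the example diskette, and `cluster_offset` is an illustrative name.

```python
def cluster_offset(cluster: int, data_start: int = 0x4200,
                   sectors_per_cluster: int = 1,
                   bytes_per_sector: int = 512) -> int:
    """Byte offset of a cluster within the disk image."""
    return data_start + (cluster - 2) * sectors_per_cluster * bytes_per_sector


# Offsets of the seven clusters of the example file (clusters 2 through 8).
offsets = [hex(cluster_offset(c)) for c in range(2, 9)]
```

For the example file this yields 0x4200, 0x4400, 0x4600, 0x4800, 0x4a00, 0x4c00, and 0x4e00.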

See Also

Forum

  • from raw bits to directory listing (code posted) in the forum
