Ext2: Difference between revisions

4,562 bytes added ,  13 years ago
{{Filesystems}}
The '''Second Extended Filesystem''' ('''ext2fs''') is a rewrite of the original ''Extended Filesystem'' and, as such, is also based around the concept of "inodes." Ext2 served as the de facto filesystem of Linux for nearly a decade, from the early 1990s to the early 2000s, when it was superseded by the journaling file systems [[Ext3|ext3]] and [[ReiserFS]]. It has native support for UNIX ownership / access rights, symbolic and hard links, and other properties that are common among UNIX-like operating systems. Organizationally, it divides disk space up into groups called "block groups." Having these groups results in the distribution of data across the disk, which helps to minimize head movement as well as the impact of fragmentation. Further, some (if not all) groups are required to contain backups of important data that can be used to rebuild the file system in the event of disaster.

'''Important Note: All values are little-endian unless otherwise specified'''

== Basic Concepts ==
 
=== What is a Block? ===
The Ext2 file system divides up disk space into logical blocks of contiguous space. The size of a block need not be the same as the sector size of the disk the file system resides on. The block size can be determined by reading the field starting at byte 24 in the [[#Superblock|Superblock]].
 
=== What is a Block Group? ===
Blocks, along with inodes, are divvied up into "block groups." These are nothing more than contiguous groups of blocks.
 
Each block group reserves a few of its blocks for special purposes such as:
* A bitmap of free/allocated blocks within the group
* A bitmap of allocated inodes within the group
* A table of inode structures that belong to the group
* Depending upon the revision of Ext2 used, some or all block groups may also contain a backup copy of the [[#Superblock|Superblock]] and the [[#Block_Group_Descriptor_Table|Block Group Descriptor Table]].
 
=== What is an Inode? ===
An inode is a structure on the disk (in an inode table of a block group) that represents a file, directory, symbolic link, etc.
 
== Superblock ==
The first step in implementing an Ext2 driver is to find, extract, and parse the Superblock. The Superblock contains all information about the layout of the file system and possibly contains other important information, such as which optional features were used to create the file system. Once you have finished with the Superblock, the next step is to look at the [[#Block_Group_Descriptor_Table|Block Group Descriptor Table]].
 
=== Locating the Superblock ===
The Superblock is always located at byte 1024 from the beginning of the volume and is exactly 1024 bytes in length. For example, if the disk uses 512 byte sectors, the Superblock will begin at LBA 2 and will occupy all of sector 2 and 3.
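
As a rough illustration of the paragraph above, the following Python sketch computes which sectors hold the Superblock and reads it from a volume. The names <code>superblock_sectors</code> and <code>read_superblock</code> are hypothetical, and the volume is assumed to be any seekable binary file object (for example, a disk image opened in <code>"rb"</code> mode). The signature check uses the 0xef53 value stored at bytes 56 to 57 of the Superblock.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # byte offset from the beginning of the volume
SUPERBLOCK_SIZE = 1024     # the Superblock is exactly 1024 bytes long
EXT2_SIGNATURE = 0xEF53    # 2-byte little-endian signature at bytes 56-57


def superblock_sectors(sector_size):
    """Return (first_lba, sector_count) for the sectors holding the Superblock."""
    first_lba = SUPERBLOCK_OFFSET // sector_size
    last_lba = (SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE - 1) // sector_size
    return first_lba, last_lba - first_lba + 1


def read_superblock(volume):
    """Read the raw 1024-byte Superblock from a seekable binary file object."""
    volume.seek(SUPERBLOCK_OFFSET)
    raw = volume.read(SUPERBLOCK_SIZE)
    if len(raw) != SUPERBLOCK_SIZE:
        raise ValueError("volume is too small to contain a Superblock")
    (signature,) = struct.unpack_from("<H", raw, 56)
    if signature != EXT2_SIGNATURE:
        raise ValueError("Ext2 signature not found")
    return raw
```

With 512-byte sectors, <code>superblock_sectors(512)</code> gives LBA 2 and a count of 2 sectors, matching the example in the text.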
 
=== Determining the Number of Block Groups ===
From the Superblock, extract the size of each block, the total number of inodes, the total number of blocks, the number of blocks per block group, and the number of inodes in each block group. From this information we can infer the number of block groups there are by:
* Rounding up the total number of blocks divided by the number of blocks per block group
* Rounding up the total number of inodes divided by the number of inodes per block group
* Both (and check them against each other)
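
The two calculations listed above, and the cross-check between them, might look like the following Python sketch. The function names are illustrative only; the inputs are the corresponding Superblock fields.

```python
def ceil_div(numerator, denominator):
    """Integer division that rounds up instead of down."""
    return (numerator + denominator - 1) // denominator


def block_group_count(total_blocks, blocks_per_group,
                      total_inodes, inodes_per_group):
    """Compute the number of block groups both ways and check they agree."""
    by_blocks = ceil_div(total_blocks, blocks_per_group)
    by_inodes = ceil_div(total_inodes, inodes_per_group)
    if by_blocks != by_inodes:
        raise ValueError(
            "block-based count (%d) and inode-based count (%d) disagree"
            % (by_blocks, by_inodes))
    return by_blocks
```

A disagreement between the two results usually suggests a corrupt Superblock or a parsing mistake, which is why this sketch raises an error rather than picking one value.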
 
=== Base Superblock Fields ===
These fields are present in all versions of Ext2.

{| {{wikitable}}
! Starting Byte
! Ending Byte
! Size in Bytes
! Field Description
|-
| 0 || 3 || 4 || Total number of inodes in file system
|-
| 4 || 7 || 4 || Total number of blocks in file system
|-
| 8 || 11 || 4 || Number of blocks reserved for superuser (see offset 80)
|-
| 12 || 15 || 4 || Total number of unallocated blocks
|-
| 16 || 19 || 4 || Total number of unallocated inodes
|-
| 20 || 23 || 4 || Block number of the block containing the superblock
|-
| 24 || 27 || 4 || ''log''<sub>2</sub> (block size) - 10. (In other words, the number to shift 1,024 to the left by to obtain the block size)
|-
| 28 || 31 || 4 || ''log''<sub>2</sub> (fragment size) - 10. (In other words, the number to shift 1,024 to the left by to obtain the fragment size)
|-
| 32 || 35 || 4 || Number of blocks in each block group
|-
| 36 || 39 || 4 || Number of fragments in each block group
|-
| 40 || 43 || 4 || Number of inodes in each block group
|-
| 44 || 47 || 4 || Last mount time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 48 || 51 || 4 || Last written time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 52 || 53 || 2 || Number of times the volume has been mounted since its last consistency check ([http://en.wikipedia.org/wiki/Fsck fsck])
|-
| 54 || 55 || 2 || Number of mounts allowed before a consistency check ([http://en.wikipedia.org/wiki/Fsck fsck]) must be done
|-
| 56 || 57 || 2 || Ext2 signature (0xef53), used to help confirm the presence of Ext2 on a volume
|-
| 58 || 59 || 2 || File system state ([[#File_System_States|see below]])
|-
| 60 || 61 || 2 || What to do when an error is detected ([[#Error_Handling_Methods|see below]])
|-
| 62 || 63 || 2 || Minor portion of version (combine with Major portion below to construct full version field)
|-
| 64 || 67 || 4 || [http://en.wikipedia.org/wiki/Unix_time POSIX time] of last consistency check ([http://en.wikipedia.org/wiki/Fsck fsck])
|-
| 68 || 71 || 4 || Interval (in [http://en.wikipedia.org/wiki/Unix_time POSIX time]) between forced consistency checks ([http://en.wikipedia.org/wiki/Fsck fsck])
|-
| 72 || 75 || 4 || Operating system ID from which the filesystem on this volume was created ([[#Creator_Operating_System_IDs|see below]])
|-
| 76 || 79 || 4 || Major portion of version (combine with Minor portion above to construct full version field)
|-
| 80 || 81 || 2 || User ID that can use reserved blocks
|-
| 82 || 83 || 2 || Group ID that can use reserved blocks
|}
==== File System States ====
{| {{Wikitable}}
! Value
! State Description
|-
| 1 || File system is clean
|-
| 2 || File system has errors
|}
==== Error Handling Methods ====
{| {{Wikitable}}
! Value
! Action to Take
|-
| 1 || Ignore the error (continue on)
|-
| 2 || Remount file system as read-only
|-
| 3 || Kernel panic
|}
==== Creator Operating System IDs ====
{| {{Wikitable}}
! Value
! Operating System
|-
| 0 || [http://kernel.org/ Linux]
|-
| 1 || [http://www.gnu.org/software/hurd/hurd.html GNU HURD]
|-
| 2 || MASIX (an operating system developed by Rémy Card, one of the developers of ext2)
|-
| 3 || [http://www.freebsd.org/ FreeBSD]
|-
| 4 || Other "Lites" (BSD4.4-Lite derivatives such as [http://www.netbsd.org/ NetBSD], [http://www.openbsd.org/ OpenBSD], [http://www.opensource.apple.com/source/xnu/ XNU/Darwin], etc.)
|}
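
A minimal sketch of how the base fields (bytes 0 to 83) could be decoded is shown below, using Python's <code>struct</code> module with a little-endian format string that follows the byte ranges in the Base Superblock Fields table. The field names, the <code>BaseSuperblock</code> type, and the lookup dictionaries are made up for this example and are not taken from any particular driver.

```python
import struct
from collections import namedtuple

# Bytes 0-83: 13 x 4-byte, 6 x 2-byte, 4 x 4-byte and 2 x 2-byte fields.
BASE_FORMAT = "<13I6H4I2H"

BaseSuperblock = namedtuple("BaseSuperblock", [
    "total_inodes", "total_blocks", "reserved_blocks", "free_blocks",
    "free_inodes", "superblock_block", "log_block_size", "log_fragment_size",
    "blocks_per_group", "fragments_per_group", "inodes_per_group",
    "last_mount_time", "last_write_time",
    "mount_count", "max_mount_count", "signature", "state",
    "error_handling", "version_minor",
    "last_check_time", "check_interval", "creator_os", "version_major",
    "reserved_uid", "reserved_gid",
])

FILE_SYSTEM_STATES = {1: "clean", 2: "has errors"}
ERROR_HANDLING_METHODS = {1: "ignore", 2: "remount read-only", 3: "kernel panic"}
CREATOR_OS_IDS = {0: "Linux", 1: "GNU HURD", 2: "MASIX", 3: "FreeBSD", 4: "Other Lites"}


def parse_base_superblock(raw):
    """Decode the base fields from the raw 1024-byte Superblock."""
    fields = BaseSuperblock._make(struct.unpack_from(BASE_FORMAT, raw, 0))
    if fields.signature != 0xEF53:
        raise ValueError("not an Ext2 Superblock")
    return fields


def block_size(fields):
    """Shift 1,024 left by the value stored at bytes 24-27."""
    return 1024 << fields.log_block_size
```

For example, a stored value of 0 at bytes 24 to 27 yields a block size of 1024 bytes, and a stored value of 2 yields 4096 bytes.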
 
=== Extended Superblock Fields ===
These fields are only present if the Major version (specified in the base superblock fields) is greater than or equal to 1.
{| {{Wikitable}}
! Starting Byte
! Ending Byte
! Size in Bytes
! Field Description
|-
| 84 || 87 || 4 || First non-reserved inode in file system. (In versions < 1.0, this is fixed as 11)
|-
| 88 || 89 || 2 || Size of each inode structure in bytes. (In versions < 1.0, this is fixed as 128)
|-
| 90 || 91 || 2 || Block group that this superblock is part of (if backup copy)
|-
| 92 || 95 || 4 || Optional features present (features that are not required to read or write, but usually result in a performance increase; [[#Optional_Feature_Flags|see below]])
|-
| 96 || 99 || 4 || Required features present (features that are required to be supported to read or write; [[#Required_Feature_Flags|see below]])
|-
| 100 || 103 || 4 || Features that, if not supported, require the volume to be mounted read-only ([[#Read-Only_Feature_Flags|see below]])
|-
| 104 || 119 || 16 || File system ID (what is output by blkid)
|-
| 120 || 135 || 16 || Volume name (C-style string: characters terminated by a 0 byte)
|-
| 136 || 199 || 64 || Path volume was last mounted to (C-style string: characters terminated by a 0 byte)
|-
| 200 || 203 || 4 || Compression algorithms used (see Required features above)
|-
| 204 || 204 || 1 || Number of blocks to preallocate for files
|-
| 205 || 205 || 1 || Number of blocks to preallocate for directories
|-
| 206 || 207 || 2 || (Unused)
|-
| 208 || 223 || 16 || Journal ID (same style as the File system ID above)
|-
| 224 || 227 || 4 || Journal inode
|-
| 228 || 231 || 4 || Journal device
|-
| 232 || 235 || 4 || Head of orphan inode list
|-
| 236 || 1023 || 788 || (Unused)
|}
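
The following Python sketch shows one possible way to decode the extended fields at bytes 84 to 199, falling back to the fixed values (first non-reserved inode 11, inode size 128) when the Major version is below 1. The function and key names are hypothetical, and decoding the strings as UTF-8 is an assumption made for the example rather than something the format specifies.

```python
import struct

EXTENDED_OFFSET = 84
# Bytes 84-199: first inode, inode size, block group, three feature
# bitmaps, 16-byte file system ID, 16-byte volume name, 64-byte path.
EXTENDED_FORMAT = "<IHH3I16s16s64s"


def c_string(raw):
    """Decode a C-style string: keep the bytes before the first 0 byte."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


def parse_extended_superblock(raw, version_major):
    """Decode bytes 84-199, or return the fixed defaults for old versions."""
    if version_major < 1:
        return {"first_inode": 11, "inode_size": 128}
    (first_inode, inode_size, block_group, optional, required, read_only,
     fs_id, volume_name, last_mount_path) = struct.unpack_from(
        EXTENDED_FORMAT, raw, EXTENDED_OFFSET)
    return {
        "first_inode": first_inode,
        "inode_size": inode_size,
        "block_group": block_group,
        "optional_features": optional,
        "required_features": required,
        "read_only_features": read_only,
        "fs_id": fs_id,
        "volume_name": c_string(volume_name),
        "last_mount_path": c_string(last_mount_path),
    }
```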
==== Optional Feature Flags ====
 
These are optional features for an implementation to support, but offer performance or reliability gains to implementations that do support them.
{| {{Wikitable}}
|-
Line 224 ⟶ 185:
! Description
|-
| 0x0001 || Preallocate some number of blocks (see byte 205 in the superblock) to a directory when creating a new one
|-
| 0x0002 || AFS server inodes exist
Line 237 ⟶ 198:
|}
 
==== Required Feature Flags ====
These features, if present on a file system, are required to be supported by an implementation in order to correctly read from or write to the file system.
{| {{Wikitable}}
|-
Line 243 ⟶ 205:
! Description
|-
| 0x0001 || Compression is used
|-
| 0x0002 || Directory entries contain a file type field
|-
| 0x0004 || File system needs recovery
Line 252 ⟶ 214:
|}
 
==== Read-Only Feature Flags ====
These features, if present on a file system, are required in order for an implementation to write to the file system, but are not required to read from the file system.
{| {{Wikitable}}
|-
Line 260 ⟶ 223:
| 0x0001 || Sparse superblocks and group descriptor tables
|-
| 0x0002 || File system uses a 64-bit file size
|-
| 0x0004 || Directory contents are stored in the form of a [http://en.wikipedia.org/wiki/Binary_tree Binary Tree]
|}
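
To illustrate how the required and read-only feature bitmaps might be used together, here is a small Python sketch that decides how a volume can be mounted. The constant names and the <code>mount_mode</code> function are invented for this example; only the flag values come from the tables above.

```python
# Required feature flags (Superblock bytes 96-99)
REQUIRED_COMPRESSION = 0x0001
REQUIRED_DIR_FILE_TYPE = 0x0002
REQUIRED_NEEDS_RECOVERY = 0x0004

# Read-only feature flags (Superblock bytes 100-103)
RO_SPARSE_SUPERBLOCKS = 0x0001
RO_64BIT_FILE_SIZE = 0x0002
RO_BINARY_TREE_DIRS = 0x0004


def mount_mode(required, read_only, supported_required, supported_read_only):
    """Return "refuse", "read-only" or "read-write" for a volume.

    `required` and `read_only` are the bitmaps read from the Superblock;
    the `supported_*` arguments are the bits this implementation handles.
    """
    if required & ~supported_required:
        return "refuse"        # cannot even read the volume correctly
    if read_only & ~supported_read_only:
        return "read-only"     # reading is safe, writing is not
    return "read-write"
```

Optional feature flags do not appear in the decision because, as described above, an implementation may ignore them.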
 
=== Block Group Descriptor Table ===
The Group Descriptor Table contains an entry for each block group within the file system.
 
=== Locating the Block Group Descriptor Table ===
The table is located in the block immediately following the Superblock. Each Descriptor contains information regarding where important data structures for that group are located.
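
Because the Superblock always starts at byte 1024, the block that contains it depends on the block size, and so does the block that follows it. The Python sketch below works this out; the function names are illustrative, and the 32-byte descriptor size is taken from the descriptor layout (bytes 0 to 31).

```python
DESCRIPTOR_SIZE = 32  # each Block Group Descriptor occupies bytes 0-31


def descriptor_table_block(block_size):
    """Block number of the block immediately following the Superblock."""
    superblock_block = 1024 // block_size  # block containing byte 1024
    return superblock_block + 1


def descriptor_offset(group, block_size):
    """Byte offset, from the start of the volume, of one group's descriptor."""
    table_start = descriptor_table_block(block_size) * block_size
    return table_start + group * DESCRIPTOR_SIZE
```

With 1024-byte blocks the table therefore starts at block 2, while with larger blocks (where the Superblock lies inside block 0) it starts at block 1.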
 
=== Block Group Descriptor ===
{| {{Wikitable}}
! Starting Byte
! Ending Byte
! Size in Bytes
! Field Description
|-
| 0 || 3 || 4 || Block address of block usage bitmap
|-
| 4 || 7 || 4 || Block address of inode usage bitmap
|-
| 8 || 11 || 4 || Starting block address of inode table
|-
| 12 || 13 || 2 || Number of unallocated blocks in group
|-
| 14 || 15 || 2 || Number of unallocated inodes in group
|-
| 16 || 17 || 2 || Number of directories in group
|-
| 18 || 31 || 14 || (Unused)
|}
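
A descriptor laid out as in the table above could be decoded along these lines. This is only a sketch: <code>GroupDescriptor</code>, its field names, and <code>parse_group_descriptor</code> are hypothetical, and <code>table</code> is assumed to be the raw bytes of the Block Group Descriptor Table already read from disk.

```python
import struct
from collections import namedtuple

# Three 4-byte block addresses, three 2-byte counters, 14 unused bytes.
GROUP_DESCRIPTOR_FORMAT = "<3I3H14x"
GROUP_DESCRIPTOR_SIZE = struct.calcsize(GROUP_DESCRIPTOR_FORMAT)  # 32 bytes

GroupDescriptor = namedtuple("GroupDescriptor", [
    "block_bitmap", "inode_bitmap", "inode_table",
    "free_blocks", "free_inodes", "directories",
])


def parse_group_descriptor(table, group):
    """Decode the descriptor for block group `group` from the raw table."""
    offset = group * GROUP_DESCRIPTOR_SIZE
    return GroupDescriptor._make(
        struct.unpack_from(GROUP_DESCRIPTOR_FORMAT, table, offset))
```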
 
=== Inodes ===
Like blocks, each inode has a numerical address. It is extremely important to note that unlike block addresses, '''inode addresses start at 1'''.
 
With Ext2 versions prior to Major version 1, inodes 1 to 10 are reserved and should be in an allocated state. Starting with version 1, the first non-reserved inode is indicated via a field in the Superblock. Of the reserved inodes, number 2 arguably has the most significance, as it is used for the root directory. Inode 1 keeps track of bad blocks, but it does not have any special status in the Linux kernel.
 
=== Determining which Block Group contains an Inode ===
From an inode address (remember that they start at 1), we can determine which group the inode is in, by using the formula:
 
group = (inode – 1) / INODES_PER_GROUP
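
The formula above gives the block group; a driver usually also needs the inode's position inside that group's inode table. The sketch below extends the formula with the remainder of the same division, which is a common approach but an addition to what the text states. The function name and the returned tuple layout are assumptions for this example; <code>inode_size</code> is the value from the Superblock (or 128 for versions below 1.0).

```python
def locate_inode(inode, inodes_per_group, inode_size, block_size):
    """Return (group, index, block_in_table, offset_in_block) for an inode.

    Inode addresses start at 1, hence the "- 1" before dividing.
    """
    group = (inode - 1) // inodes_per_group
    index = (inode - 1) % inodes_per_group        # position within the group
    byte_offset = index * inode_size              # offset into the inode table
    block_in_table = byte_offset // block_size    # block relative to table start
    offset_in_block = byte_offset % block_size
    return group, index, block_in_table, offset_in_block
```

The absolute block to read is then the inode table's starting block address from the group's descriptor plus <code>block_in_table</code>.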
 
=== Inode Data Structure ===
 
{| {{Wikitable}}
Line 348 ⟶ 329:
| 124–127 || Unused
|}
 
Each inode has a fixed number of fields; additional information may be stored in extended attributes and in blocks reached through indirect block pointers. The allocation status of an inode is determined using the inode bitmap, whose location is given in the group descriptor.

Ext2, like UFS, was designed for efficiency of small files. Therefore, each inode can store the addresses of the first 12 blocks that a file has allocated. These are called direct pointers. If a file needs more than 12 blocks, a block is allocated to store the remaining addresses. The pointer to that block is called an indirect block pointer. The addresses in the block are all four bytes, and the total number in each block is based on the block size (for example, a 1024-byte block holds 256 addresses). The indirect block pointer is stored in the inode.
 
If a file has more blocks than can fit in the 12 direct pointers and the indirect block, a double indirect block is used. With a double indirect block, the inode points to a block that contains a list of single indirect block pointers, each of which points to a block that contains a list of direct pointers. Lastly, if a file needs still more space, it can use a triple indirect block
pointer. A triple indirect block contains addresses of double indirect blocks, which contain
addresses of single indirect blocks. Each inode contains 12 direct pointers, one single
indirect pointer, one double indirect block pointer, and one triple indirect pointer.
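
To make the pointer hierarchy concrete, the following Python sketch maps a file's logical block index (0 for the first block of the file) to the pointer level that covers it and to the entry index to follow at each level. The name <code>classify_block</code> and the tuple layout are choices made for this illustration.

```python
def classify_block(logical_block, block_size):
    """Return (level, indexes) describing how to reach a file's logical block.

    `level` is "direct", "single", "double" or "triple"; `indexes` holds the
    entry to follow at each step, outermost block first.
    """
    per_block = block_size // 4          # each block address is four bytes
    if logical_block < 12:
        return "direct", (logical_block,)
    n = logical_block - 12
    if n < per_block:
        return "single", (n,)
    n -= per_block
    if n < per_block ** 2:
        return "double", (n // per_block, n % per_block)
    n -= per_block ** 2
    if n < per_block ** 3:
        return "triple", (n // per_block ** 2,
                          (n // per_block) % per_block,
                          n % per_block)
    raise ValueError("logical block is beyond what an inode can address")
```

With 1024-byte blocks (256 addresses per block), logical blocks 0 to 11 are direct, 12 to 267 go through the single indirect block, and the double indirect range begins at block 268.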
 
An inode also contains the file's size, ownership, and temporal information. The size value in newer versions of Ext2 is 64 bits, but older versions had only 32 bits and therefore could not handle files over 4 GiB. Newer versions utilize an unused field for the upper 32 bits of the size
value and set a read-only compatible feature flag when a large file exists.
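
Combining the two halves of the size is a simple shift and OR, as in the sketch below. The function and parameter names are hypothetical; <code>large_file_feature</code> stands for whether the 64-bit file size flag (0x0002) is set in the read-only feature flags, and the upper half is ignored when it is not.

```python
def file_size(size_low, size_high, large_file_feature):
    """Combine the lower and upper 32 bits of a file's size."""
    if large_file_feature:
        return (size_high << 32) | size_low
    return size_low  # older layout: only the lower 32 bits are meaningful
```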
 
"Ownership" information is stored using the user and group ID.
 
=== Directories ===
Directories are files which contain the information needed to find files within the file system. The root directory is inode 2.
 
 
 
 
 
==== Filemode flags ====
Line 485 ⟶ 493:
|EXT2_FT_MAX ||8
|}
 
== Putting it all together ==
=== How To Read A File ===
# Read the Superblock to find the size of each block, the number of blocks per group, the number of inodes per group, and the starting block of the first group.
# Read the first entry of the Block Group Descriptor Table to find the location of the inode table, then read inode 2, which is the root directory.
# The directory information is located within the data blocks that the inode points to; read all the data blocks associated with the inode.
# Iterate through the directory information to find the directory/file.
# Read the inode bitmap to find out whether the inode pointed to by the directory structure is allocated.
# If it is allocated and is a directory, go to step 3. If it is not allocated, continue with step 4.
# Read the block information in the inode to find where the file's data is located.
# Read the first 12 blocks. (If the file is smaller than 12 blocks, only read the number of blocks specified by the inode; to get the number of blocks the file takes, divide the size of the file by the size of each block, rounding up.)
# If the file is larger than 12 blocks:
## Read the indirect block.
## For each block beyond the first 12, read the corresponding pointer from the indirect block (e.g., when reading block 13, pointer 1 from the indirect block should be read).
## If the file needs more blocks, read the double indirect pointer; it points to a block containing pointers to single indirect blocks, which in turn contain pointers to blocks of data.
## If the file still needs more blocks, read the triple indirect pointer.
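
The last few steps (collecting a file's data blocks through the direct and indirect pointers) might be sketched as follows. This is an illustration under several assumptions: <code>inode_pointers</code> is the inode's 15 block pointers (12 direct, then single, double, and triple indirect), <code>read_block</code> is a caller-supplied function returning the raw bytes of a block given its number, and the file has no sparse holes (zero pointers inside indirect blocks are not treated specially).

```python
import struct


def _pointers(block_bytes):
    """Split a raw block into its little-endian 4-byte block addresses."""
    count = len(block_bytes) // 4
    return struct.unpack("<%dI" % count, block_bytes[:count * 4])


def _walk(block_number, depth, read_block, out, wanted):
    """Append data block numbers under `block_number` until `wanted` is reached."""
    if len(out) >= wanted:
        return
    if depth == 0:                      # this is a data block itself
        out.append(block_number)
        return
    for pointer in _pointers(read_block(block_number)):
        if len(out) >= wanted:
            return
        _walk(pointer, depth - 1, read_block, out, wanted)


def file_block_numbers(inode_pointers, file_size, block_size, read_block):
    """Return the block numbers holding a file's data, in file order."""
    wanted = (file_size + block_size - 1) // block_size   # round up
    out = []
    for pointer in inode_pointers[:12]:                   # direct pointers
        _walk(pointer, 0, read_block, out, wanted)
    for depth, pointer in enumerate(inode_pointers[12:15], start=1):
        if len(out) < wanted:                             # single/double/triple
            _walk(pointer, depth, read_block, out, wanted)
    return out
```

Reading the file is then a matter of calling <code>read_block</code> on each returned number and truncating the final block to the file's size.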
 
== Links ==