Ext2: Difference between revisions

The '''Second Extended Filesystem''' ('''ext2fs''') is a rewrite of the original ''Extended Filesystem'' and as such, is also based around the concept of "inodes." Ext2 served as the de facto filesystem of Linux for nearly a decade from the early 1990s to the early 2000s when it was superseded by the journaling file systems [[Ext3|ext3]] and [[ReiserFS]]. It has native support for UNIX ownership / access rights, symbolic- and hard-links, and other properties that are common among UNIX-like operating systems. Organizationally, it divides disk space up into groups called "block groups." Having these groups results in distribution of data across the disk which helps to minimize head movement as well as the impact of fragmentation. Further, some (if not all) groups are required to contain backups of important data that can be used to rebuild the file system in the event of disaster.
 
''Note: Most of the information here is based on work done by Dave Poirier on the ext2-doc project (see the [[#External_Links|links section]]) which is graciously released under the [http://www.fsf.org/licenses/fdl.html GNU Free Documentation License]. Be sure to buy him a beer the next time you see him.''
 
 
== Basic Concepts ==
'''Important Note: All values are little-endian unless otherwise specified'''
 
=== What is a Block? ===
 
=== What is a Block Group? ===
Blocks, along with inodes, are divided up into "block groups." These are nothing more than contiguous groups of blocks.
 
Each block group reserves a few of its blocks for special purposes such as:
 
=== What is an Inode? ===
An inode is a structure on the disk that represents a file, directory, symbolic link, etc. Inodes do not contain the data of the file / directory / etc. that they represent. Instead, they link to the blocks that actually contain the data. This lets the inodes themselves have a well-defined size, which lets them be placed in easily indexed arrays. Each block group has an array of inodes it is responsible for, and conversely every inode within a file system belongs to one such table (and one such block group).
 
== Superblock ==
| 16 || 19 || 4 || Total number of unallocated inodes
|-
| 20 || 23 || 4 || Block number of the block containing the superblock (also the starting block number, NOT always zero.)
|-
| 24 || 27 || 4 || ''log''<sub>2</sub> (block size) - 10. (In other words, the number to shift 1,024 to the left by to obtain the block size)
 
=== Extended Superblock Fields ===
These fields are only present if the Major version (specified in the base superblock fields) is greater than or equal to 1.
{| {{Wikitable}}
! Starting
| 0x0002 || Directory entries contain a type field
|-
| 0x0004 || File system needs to replay its journal
|-
| 0x0008 || File system uses a journal device
Like blocks, each inode has a numerical address. It is extremely important to note that unlike block addresses, '''inode addresses start at 1'''.
 
With Ext2 versions prior to Major version 1, inodes 1 to 10 are reserved and should be in an allocated state. Starting with version 1, the first non-reserved inode is indicated via a field in the Superblock. Of the reserved inodes, number 2 arguably has the most significance as it is used for the root directory. Inode 1 keeps track of bad blocks, but it does not have any special status in the Linux kernel.

Inodes have a fixed size of either 128 bytes for version 0 Ext2 file systems, or a size dictated by a field in the Superblock for version 1 file systems. All inodes reside in inode tables that belong to block groups. Therefore, looking up an inode is simply a matter of determining which block group it belongs to and indexing that block group's inode table.
 
=== Determining which Block Group contains an Inode ===
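The block group that contains an inode is given by (rounding down):

block group = (inode - 1) / INODES_PER_GROUP

and the index of the inode within that block group's inode table by:

index = (inode - 1) % INODES_PER_GROUP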
 
where % denotes the [http://en.wikipedia.org/wiki/Modulo_operation Modulo operation] and INODES_PER_GROUP is a field in the Superblock (the same field which was used to determine which block group the inode belongs to).
 
Next, we have to determine which block contains our inode. This is achieved from:
 
containing block = (index * INODE_SIZE) / BLOCK_SIZE
 
where INODE_SIZE is either fixed at 128 if VERSION < 1 or defined by a field in the Superblock if VERSION >= 1, and BLOCK_SIZE is defined by a field in the Superblock.
 
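Finally, extract the inode from the containing block. The containing block is counted from the start of the block group's inode table, and the inode begins at byte offset (index * INODE_SIZE) % BLOCK_SIZE within that block.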
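The lookup arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function and field names are made up for the example, and the block group formula (inode - 1) / INODES_PER_GROUP is the standard ext2 rule rather than something spelled out in the lines above.

```python
from typing import NamedTuple


class InodeLocation(NamedTuple):
    block_group: int       # block group whose inode table holds the inode
    index: int             # index of the inode within that inode table
    containing_block: int  # block offset from the start of the inode table
    offset_in_block: int   # byte offset of the inode inside that block


def locate_inode(inode: int, inodes_per_group: int,
                 inode_size: int, block_size: int) -> InodeLocation:
    """Hypothetical helper: turn an inode address into table coordinates."""
    if inode < 1:
        raise ValueError("inode addresses start at 1")
    block_group = (inode - 1) // inodes_per_group
    index = (inode - 1) % inodes_per_group
    byte_offset = index * inode_size
    return InodeLocation(block_group, index,
                         byte_offset // block_size,
                         byte_offset % block_size)
```

For example, with 8192 inodes per group, 128-byte inodes and 1024-byte blocks, the root directory (inode 2) is at index 1 of block group 0, in the first block of the inode table at byte offset 128.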
 
=== Reading the contents of an inode ===
Each inode contains 12 direct pointers, one singly indirect pointer, one doubly indirect block pointer, and one triply indirect pointer. The direct space "overflows" into the singly indirect space, which overflows into the doubly indirect space, which overflows into the triply indirect space.
 
'''Direct Block Pointers''': There are 12 direct block pointers. If valid, the value is non-zero. Each pointer is the block address of a block containing data for this inode.
 
'''Singly Indirect Block Pointer''': If a file needs more than 12 blocks, a separate block is allocated to store the block addresses of the remaining data blocks needed to store its contents. This separate block is called an indirect block because it adds an extra step (a level of indirection) between an inode and its data. The block addresses stored in the block are all 32-bit, and the capacity of stored addresses in this block is a function of the block size. The address of this indirect block is stored in the inode in the "Singly Indirect Block Pointer" field.
 
'''Doubly Indirect Block Pointer''': If a file has more blocks than can fit in the 12 direct pointers and the indirect block, a double indirect block is used. A double indirect block is an extension of the indirect block described above only now we have two intermediate blocks between the inode and data blocks. The inode structure has a "Doubly Indirect Block Pointer" field that points to this block if necessary.
 
'''Triply Indirect Block Pointer''': Lastly, if a file needs still more space, it can use a triple indirect block. Again, this is an extension of the double indirect block. So, a triple indirect block contains addresses of double indirect blocks, which contain addresses of single indirect blocks, which contain address of data blocks. The inode structure has a "Triply Indirect Block Pointer" field that points to this block if present.
 
[http://en.wikipedia.org/wiki/File:Ext2-inode.gif This image from Wikipedia] illustrates what is described above pretty well.
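As a rough sketch of how the "overflow" from direct to indirect pointers works, the following Python function maps a logical block number within a file to the pointer that has to be followed and the index to use at each level of indirection. The function name and return format are invented for this example; the only facts it relies on are the 12 direct pointers and the 32-bit (4-byte) block addresses described above.

```python
from typing import List, Tuple


def block_path(n: int, block_size: int) -> Tuple[str, List[int]]:
    """Return (pointer kind, indices to follow) for logical block n of a file."""
    per_block = block_size // 4  # 32-bit addresses per indirect block
    if n < 0:
        raise ValueError("logical block number must be non-negative")
    if n < 12:
        return ("direct", [n])
    n -= 12
    if n < per_block:
        return ("single", [n])
    n -= per_block
    if n < per_block ** 2:
        return ("double", [n // per_block, n % per_block])
    n -= per_block ** 2
    if n < per_block ** 3:
        return ("triple", [n // per_block ** 2,
                           (n // per_block) % per_block,
                           n % per_block])
    raise ValueError("block number is beyond the triply indirect range")
```

With 1024-byte blocks each indirect block holds 256 addresses, so logical block 12 is the first entry of the singly indirect block and logical block 268 is the first block reached through the doubly indirect pointer.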
 
=== Inode Data Structure ===
 
{| {{Wikitable}}
! Starting Byte
! Ending Byte
! Size in Bytes
! Field Description
|-
| 0 || 1 || 2 || Type and Permissions ([[#Inode_Type_and_Permissions|see below]])
|-
| 2 || 3 || 2 || User ID
|-
| 4 || 7 || 4 || Lower 32 bits of size in bytes
|-
| 8 || 11 || 4 || Last Access Time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 12 || 15 || 4 || Creation Time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 16 || 19 || 4 || Last Modification time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 20 || 23 || 4 || Deletion time (in [http://en.wikipedia.org/wiki/Unix_time POSIX time])
|-
| 24 || 25 || 2 || Group ID
|-
| 26 || 27 || 2 || Count of hard links (directory entries) to this inode. When this reaches 0, the data blocks are marked as unallocated.
|-
| 28 || 31 || 4 || Count of disk sectors (not Ext2 blocks) in use by this inode, not counting the actual inode structure nor directory entries linking to the inode.
|-
| 32 || 35 || 4 || Flags ([[#Inode_Flags|see below]])
|-
| 36 || 39 || 4 || [[#OS_Specific_Value_1|Operating System Specific value #1]]
|-
| 40 || 43 || 4 || Direct Block Pointer 0
|-
| 44 || 47 || 4 || Direct Block Pointer 1
|-
| 48 || 51 || 4 || Direct Block Pointer 2
|-
| 52 || 55 || 4 || Direct Block Pointer 3
|-
| 56 || 59 || 4 || Direct Block Pointer 4
|-
| 60 || 63 || 4 || Direct Block Pointer 5
|-
| 64 || 67 || 4 || Direct Block Pointer 6
|-
| 68 || 71 || 4 || Direct Block Pointer 7
|-
| 72 || 75 || 4 || Direct Block Pointer 8
|-
| 76 || 79 || 4 || Direct Block Pointer 9
|-
| 80 || 83 || 4 || Direct Block Pointer 10
|-
| 84 || 87 || 4 || Direct Block Pointer 11
|-
| 88 || 91 || 4 || Singly Indirect Block Pointer (Points to a block that is a list of block pointers to data)
|-
| 92 || 95 || 4 || Doubly Indirect Block Pointer (Points to a block that is a list of block pointers to Singly Indirect Blocks)
|-
| 96 || 99 || 4 || Triply Indirect Block Pointer (Points to a block that is a list of block pointers to Doubly Indirect Blocks)
|-
| 100 || 103 || 4 || Generation number (Primarily used for NFS)
|-
| 104 || 107 || 4 || In Ext2 version 0, this field is reserved. In version >= 1, Extended attribute block (File ACL).
|-
| 108 || 111 || 4 || In Ext2 version 0, this field is reserved. In version >= 1, Upper 32 bits of file size (if feature bit set) if it's a file, Directory ACL if it's a directory
|-
| 112 || 115 || 4 || Block address of fragment
|-
| 116 || 127 || 12 || [[#OS_Specific_Value_2|Operating System Specific Value #2]]
|}

The allocation status of an inode is determined using the inode bitmap, whose location is given in the group descriptor.

Ext2, like UFS, was designed for efficiency with small files, which is why each inode stores the addresses of the first 12 blocks of a file directly.

An inode also contains the file's size, ownership (the user and group ID), and temporal information. The size value was originally only 32 bits wide, so files over 4 GiB could not be handled. Newer versions use a previously unused field for the upper 32 bits of the size and set a read-only compatible feature flag when a large file exists.
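The 128-byte inode layout can be decoded with Python's struct module. The sketch below is illustrative only: the format string assumes the little-endian base layout of 128 bytes (offsets 0 to 127, with the OS-specific bytes 116 to 127 left undecoded), and the dictionary keys are names chosen for this example rather than official field names.

```python
import struct
from typing import Dict

# '<' = little-endian, no padding. Fields in order:
# mode, uid (2 x u16); size, atime, ctime, mtime, dtime (5 x u32);
# gid, link count (2 x u16); sectors, flags, OS value 1 (3 x u32);
# 15 block pointers (u32 each); generation, file ACL,
# size-high/dir ACL, fragment address (4 x u32); 12 bytes of OS value 2.
INODE_FORMAT = "<HH5IHH3I15I4I12s"


def parse_inode(raw: bytes) -> Dict[str, object]:
    """Decode the first 128 bytes of an on-disk inode (hypothetical helper)."""
    f = struct.unpack_from(INODE_FORMAT, raw)
    return {
        "mode": f[0], "uid_low": f[1], "size_low": f[2],
        "atime": f[3], "ctime": f[4], "mtime": f[5], "dtime": f[6],
        "gid_low": f[7], "links": f[8], "sectors": f[9],
        "flags": f[10], "osd1": f[11],
        "direct": list(f[12:24]),
        "single_indirect": f[24],
        "double_indirect": f[25],
        "triple_indirect": f[26],
        "generation": f[27], "file_acl": f[28],
        "size_high_or_dir_acl": f[29],
        "fragment_addr": f[30], "osd2": f[31],
    }
```

Inodes larger than 128 bytes (version 1 file systems with a bigger inode size) still start with this layout, so unpack_from simply ignores the extra bytes.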
 
==== Inode Type and Permissions ====
{| {{Wikitable}}
|+The type indicator occupies the top hex digit (bits 15 to 12) of this 16-bit field
! Type value in hex
! Type Description
|-
| 0x1000 || FIFO
|-
| 0x2000 || Character device
|-
| 0x4000 || Directory
|-
| 0x6000 || Block device
|-
| 0x8000 || Regular file
|-
| 0xA000 || Symbolic link
|-
| 0xC000 || Unix socket
|}

{| {{Wikitable}}
|+Permissions occupy the bottom 12 bits of this 16-bit field
! Permission value in hex
! Permission value in octal
! Permission Description
|-
| 0x001 || 00001 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Other: execute permission]
|-
| 0x002 || 00002 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Other: write permission]
|-
| 0x004 || 00004 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Other: read permission]
|-
| 0x008 || 00010 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Group: execute permission]
|-
| 0x010 || 00020 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Group: write permission]
|-
| 0x020 || 00040 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions Group: read permission]
|-
| 0x040 || 00100 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions User: execute permission]
|-
| 0x080 || 00200 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions User: write permission]
|-
| 0x100 || 00400 || [http://en.wikipedia.org/wiki/Filesystem_permissions#Traditional_Unix_permissions User: read permission]
|-
| 0x200 || 01000 || [http://en.wikipedia.org/wiki/Sticky_bit Sticky Bit]
|-
| 0x400 || 02000 || Set group ID
|-
| 0x800 || 04000 || Set user ID
|}
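Splitting the 16-bit "Type and Permissions" field is a matter of two masks. The following Python sketch shows one way to do it; the helper names are invented for this example, and the permission string deliberately ignores the sticky, set-group-ID and set-user-ID bits to stay short.

```python
TYPE_NAMES = {
    0x1000: "FIFO",
    0x2000: "Character device",
    0x4000: "Directory",
    0x6000: "Block device",
    0x8000: "Regular file",
    0xA000: "Symbolic link",
    0xC000: "Unix socket",
}


def split_mode(mode):
    """Return (type description, 12-bit permission value) for a mode field."""
    file_type = mode & 0xF000    # top hex digit: bits 15 to 12
    permissions = mode & 0x0FFF  # bottom 12 bits
    return TYPE_NAMES.get(file_type, "Unknown"), permissions


def permission_string(permissions):
    """Render the user/group/other bits as 'rwxr-xr-x' style text."""
    out = ""
    for shift in (6, 3, 0):  # user, group, other
        bits = (permissions >> shift) & 0o7
        out += "r" if bits & 4 else "-"
        out += "w" if bits & 2 else "-"
        out += "x" if bits & 1 else "-"
    return out
```

A mode of 0x41ED, for instance, splits into "Directory" and octal 0755.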
 
==== Inode Flags ====
{| {{Wikitable}}
|-
| 0x00000040 || File is not included in 'dump' command
|-
| 0x00000080 || Last accessed time should not be updated
|-
| ... || (Reserved)
|-
| 0x00001000 || Hash indexed directory
|-
| 0x00002000 || AFS directory
|-
| 0x00004000 || Journal file data
|}
==== OS Specific Value 1 ====
{| {{Wikitable}}
! Operating
System
! How they use this field
|-
| Linux || (reserved)
|-
| HURD || "translator"?
|-
| MASIX || (reserved)
|}
==== OS Specific Value 2 ====
{| {{Wikitable}}
! Operating
System
! How they use this field
|-
| Linux ||
{| {{Wikitable}}
! Starting
Byte
! Ending
Byte
! Size
in Bytes
! Field Description
|-
| 116 || 116 || 1 || Fragment number
|-
| 117 || 117 || 1 || Fragment size
|-
| 118 || 119 || 2 || (reserved)
|-
| 120 || 121 || 2 || High 16 bits of 32-bit User ID
|-
| 122 || 123 || 2 || High 16 bits of 32-bit Group ID
|-
| 124 || 127 || 4 || (reserved)
|}
|-
| HURD ||
{| {{Wikitable}}
! Starting
Byte
! Ending
Byte
! Size
in Bytes
! Field Description
|-
| 116 || 116 || 1 || Fragment number
|-
| 117 || 117 || 1 || Fragment size
|-
| 118 || 119 || 2 || High 16 bits of 32-bit "Type and Permissions" field
|-
| 120 || 121 || 2 || High 16 bits of 32-bit User ID
|-
| 122 || 123 || 2 || High 16 bits of 32-bit Group ID
|-
| 124 || 127 || 4 || User ID of author (if == 0xFFFFFFFF, the normal User ID will be used)
|}
|-
| MASIX ||
{| {{Wikitable}}
! Starting
Byte
! Ending
Byte
! Size
in Bytes
! Field Description
|-
| 116 || 116 || 1 || Fragment number
|-
| 117 || 117 || 1 || Fragment size
|-
| 118 || 127 || 10 || (reserved)
|}
|}
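On Linux, the full 32-bit user and group IDs are assembled from the 16-bit values in the main inode structure and the high halves stored in Operating System Specific Value #2. A minimal Python sketch, assuming the Linux layout from the table above (the function name is hypothetical):

```python
import struct


def linux_ids(uid_low, gid_low, osd2):
    """Combine low and high ID halves using the Linux layout of bytes 116-127."""
    # fragment number, fragment size, reserved, uid high, gid high, reserved
    (_frag_num, _frag_size, _reserved1,
     uid_high, gid_high, _reserved2) = struct.unpack("<BBHHHI", osd2)
    uid = (uid_high << 16) | uid_low
    gid = (gid_high << 16) | gid_low
    return uid, gid
```

The HURD layout would need a different unpacking of the same 12 bytes, since it reuses bytes 118 to 119 and 124 to 127 for other purposes.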
 
=== Directories ===
Directories are inodes which contain some number of "entries" as their contents. These entries are nothing more than a name/inode pair. For instance the inode corresponding to the root directory might have an entry with the name of "etc" and an inode value of 50. A directory inode stores these entries in a linked-list fashion in its contents blocks.
 
The root directory is Inode 2.

The total size of a directory entry may be longer than the length of the name would imply (the name may not span to the end of the record), and records have to be aligned to 4-byte boundaries. Directory entries are also not allowed to span multiple blocks on the file system, so space may be left over at the end of a block. Gaps between directory entries are not allowed, however, so any leftover space is made part of the preceding record by increasing its record length to include it. Empty space may equivalently be marked by a separate directory entry with an inode number of zero, indicating that the entry should be skipped.
 
=== Directory Entry ===
 
{| {{Wikitable}}
! Starting Byte
! Ending Byte
! Size in Bytes
! Field Description
|-
| 0 || 3 || 4 || Inode
|-
| 4 || 5 || 2 || Total size of this entry (including all subfields)
|-
| 6 || 6 || 1 || Name Length (least-significant 8 bits)
|-
| 7 || 7 || 1 || [[#Directory_Entry_Type_Indicators|Type indicator]] (only if the feature bit for "directory entries have file type byte" is set, else this is the most-significant 8 bits of the Name Length)
|-
| 8 || 8+N-1 || N || Name characters
|}
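Because each entry records its own total size, walking a directory block is a simple loop that jumps from record to record. The Python sketch below is one possible way to do this; the function name and the has_type_field parameter are assumptions made for the example, and the caller is expected to know from the Superblock whether the type-field feature is enabled.

```python
import struct
from typing import Iterator, Tuple


def iter_dir_entries(block: bytes,
                     has_type_field: bool = True
                     ) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (inode, type indicator, name) for each used entry in one block."""
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len, type_or_high = struct.unpack_from(
            "<IHBB", block, offset)
        # A record shorter than its own header would loop forever.
        if rec_len < 8 or offset + rec_len > len(block):
            raise ValueError("corrupt directory entry at offset %d" % offset)
        if not has_type_field:
            # Byte 7 is then the high 8 bits of the name length.
            name_len |= type_or_high << 8
            type_or_high = 0
        if inode != 0:  # inode 0 marks an entry that should be skipped
            name = block[offset + 8:offset + 8 + name_len]
            yield inode, type_or_high, name
        offset += rec_len
```

Names are returned as raw bytes; how to decode them is left to the caller.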
 
 
==== Directory Entry Type Indicators ====

{| {{Wikitable}}
! Value
! Type Description
! Constant name
|-
| 0 || Unknown type || EXT2_FT_UNKNOWN
|-
| 1 || Regular file || EXT2_FT_REG_FILE
|-
| 2 || Directory || EXT2_FT_DIR
|-
| 3 || Character device || EXT2_FT_CHRDEV
|-
| 4 || Block device || EXT2_FT_BLKDEV
|-
| 5 || FIFO || EXT2_FT_FIFO
|-
| 6 || Socket || EXT2_FT_SOCK
|-
| 7 || Symbolic link (soft link) || EXT2_FT_SYMLINK
|}
 
== Quick Summaries ==
=== How To Read An Inode ===
# Read the Superblock to find the size of each block, the number of blocks per group, the number of inodes per group, and the starting block of the first group (Block Group Descriptor Table).
# Determine which block group the inode belongs to.
# Read the Block Group Descriptor corresponding to the block group which contains the inode to be looked up.
# From the Block Group Descriptor, extract the location of the block group's inode table.
# Determine the index of the inode in the inode table.
# Index the inode table (taking into account non-standard inode size).

Directory entry information and file contents are located within the data blocks that the inode points to.

=== How To Read the Root Directory ===
The root directory's inode is defined to always be 2. Read/parse the contents of inode 2.
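The lookup procedure can be sketched end to end in Python against a file system image held in memory. Treat this as a rough illustration under stated assumptions, not a reference implementation: apart from the starting block number (byte 20) and the block size shift (byte 24), the offsets used here are not given in the summary above and are assumed from the commonly documented ext2 layout. They are: the Superblock at byte 1024 of the volume, inodes per group at byte 40, Major version at byte 76, inode size at byte 88, 32-byte Block Group Descriptors stored in the block after the Superblock, and the inode table address at byte 8 of each descriptor. Check them against the Superblock and Block Group Descriptor tables before relying on them.

```python
import struct


def read_inode(image: bytes, inode: int) -> bytes:
    """Return the raw bytes of one inode from an in-memory ext2 image."""
    # Step 1: read the Superblock (assumed to start at byte 1024).
    sb = image[1024:2048]
    first_block, log_block_size = struct.unpack_from("<II", sb, 20)
    (inodes_per_group,) = struct.unpack_from("<I", sb, 40)  # assumed offset
    (major,) = struct.unpack_from("<I", sb, 76)             # assumed offset
    if major >= 1:
        (inode_size,) = struct.unpack_from("<H", sb, 88)    # assumed offset
    else:
        inode_size = 128
    block_size = 1024 << log_block_size

    # Step 2: which block group does the inode belong to?
    group = (inode - 1) // inodes_per_group

    # Step 3: find that group's descriptor (assumed 32 bytes each, table
    # starting in the block after the one holding the Superblock).
    desc_offset = (first_block + 1) * block_size + group * 32

    # Step 4: the inode table's starting block (assumed at descriptor byte 8).
    (inode_table_block,) = struct.unpack_from("<I", image, desc_offset + 8)

    # Steps 5 and 6: index the inode table.
    index = (inode - 1) % inodes_per_group
    start = inode_table_block * block_size + index * inode_size
    return image[start:start + inode_size]
```

Calling read_inode(image, 2) on a valid image would return the root directory's inode, whose block pointers then lead to its directory entries.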
 
==See Also==
===External Links===
* [http://www.nongnu.org/ext2-doc/ ext2-doc project: Second Extended File System] - implementation-oriented documentation, describes internal structure in human language.
* [http://web.mit.edu/tytso/www/linux/ext2intro.html Design and Implementation of the Second Extended Filesystem] (overview)
* [http://ext2.sourceforge.net/2005-ols/paper-html/ State of the Art: Where we are with the Ext3 filesystem] - Paper by Mingming Cao, Theodore Y. Ts'o, Badari Pulavarty, and Suparna Bhattacharya describing extended features for ext2
 
[[Category:Filesystems]]
[[de:Ext2]]