Ext2

Ext2/ext3 is used by many distributions of Linux. Ext2/3 is based on the Unix File System (UFS), with the UFS components that were no longer needed removed.
== About ext2fs (Second Extended Filesystem) ==
 
The Second Extended Filesystem (ext2fs) was the default filesystem of Linux prior to the advent of the journaling file systems ext3fs and ReiserFS. It has native support for UNIX ownership and access rights, symbolic and hard links, and other Unix-native properties. Like HPFS, it tries to minimize head movement by distributing data across the disk. By using "groups", it also minimizes the impact of fragmentation. It is another "inode" based system. An ext2fs partition is made up of blocks, which normally are 1K each. The first block (the bootblock) is zeroed; all the other blocks are divided into so-called block groups (normally, between 256 and 8192 blocks form a group). Each block group contains:
 
* a copy of the superblock (which is a mighty useful structure containing info about the filesystem);
* the filesystem descriptors (the group descriptor table, which records where the bitmaps and the inode table of each group are located)
* the block bitmap, which tells which blocks are used
* the inode bitmap, which tells which inodes are used
* the inode table, which contains the inodes themselves
* the data blocks referenced by the inodes
 
The first inode is a special one; it is the bad blocks inode, which references all the damaged blocks of the partition. The second inode contains the root directory, and the fifth is reserved for the bootloader.

Windows users can access ext2fs partitions with [http://www.chrysocome.net/explore2fs explore2fs].

== About ext3fs (Third Extended File System) ==

ext3fs is basically ext2fs with journaling added. If your ext3fs partition does not need journal replay, it can even be accessed with a 'simple' ext2fs driver.

== File System Structure ==

The Ext2 file system is split up into blocks, whose size is defined in the superblock. Blocks are grouped together into block groups. Each block group contains a backup copy of the superblock and of the group descriptor table. Each block group also contains a block bitmap (a bitmap of allocated blocks within the group), an inode bitmap (a bitmap of allocated inodes within the group) and an inode table.

=== Superblock ===

The superblock, which contains important information about the layout of the file system, is located at byte 1024 and is 1024 bytes long. From the superblock we can learn the number of inodes (bytes 0–3), the number of blocks (bytes 4–7), where the first block group is located (bytes 20–23), the size of each block (bytes 24–27), the number of blocks per group (bytes 32–35) and the number of inodes in each group (bytes 40–43). From this information we can work out which group an inode belongs to and the total number of groups in the file system.
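As a minimal sketch, the superblock fields listed above could be decoded in Python as follows. The function names and the returned dictionary keys are made up for this example, and the code assumes the on-disk values are little-endian. The group count is rounded up because the last group may be shorter than the others.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the superblock starts at byte 1024
SUPERBLOCK_SIZE = 1024     # and is 1024 bytes long
EXT2_SIGNATURE = 0xEF53    # stored in bytes 56-57

def parse_superblock(raw):
    """Decode a few fields of a 1024-byte ext2 superblock (illustrative names)."""
    if len(raw) < SUPERBLOCK_SIZE:
        raise ValueError("the superblock is 1024 bytes long")
    inodes, blocks = struct.unpack_from("<II", raw, 0)
    first_block, log_block_size = struct.unpack_from("<II", raw, 20)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
    (signature,) = struct.unpack_from("<H", raw, 56)
    if signature != EXT2_SIGNATURE:
        raise ValueError("signature 0xEF53 not found")
    # Ceiling division: a partially filled last group still counts.
    groups = -(-(blocks - first_block) // blocks_per_group)
    return {
        "inodes": inodes,
        "blocks": blocks,
        "first_block": first_block,
        # The block size is stored as a shift count applied to 1024.
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "groups": groups,
    }

def read_superblock(path):
    """Read the superblock from a disk image or device file."""
    with open(path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(SUPERBLOCK_SIZE))
```
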
 
=== Group Descriptor Table ===
 
The Group Descriptor Table contains an entry for each block group within the filesystem. The table is located in the block immediately after the superblock. Each descriptor records where the important data structures for that group are located.
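The location rule above can be sketched as a small calculation. This example assumes that the superblock sits in the block where block group 0 starts (superblock bytes 20–23), so the table begins one block later. The function name is illustrative; the 32-byte entry size is taken from the descriptor layout given under Data structures.

```python
DESCRIPTOR_SIZE = 32  # bytes per group descriptor entry

def descriptor_offset(group, first_block, block_size):
    """Byte offset of the descriptor for `group` from the start of the partition.

    first_block is the block where block group 0 starts; the superblock is
    assumed to live in that block, and the table in the block after it.
    """
    table_block = first_block + 1
    return table_block * block_size + group * DESCRIPTOR_SIZE
```

With 1 KiB blocks group 0 starts at block 1, so the table starts at byte 2048. With 4 KiB blocks group 0 starts at block 0 and the table starts at byte 4096.
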
 
=== Inodes ===
 
One inode is allocated to every file and directory, and each inode has an address. From an inode address, we can determine which group the inode is in by using the following formula.
 
group = (inode – 1) / INODES_PER_GROUP
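A short sketch of this calculation in Python follows. It also derives the inode's index inside its group and the byte offset inside that group's inode table; those two steps are not spelled out in the formula above and are included as the natural continuation. The function name is illustrative, and the default inode size of 128 bytes is an assumption (the real value is in superblock bytes 88–89).

```python
def locate_inode(inode, inodes_per_group, inode_size=128):
    """Return (group, index within group, byte offset within the inode table)."""
    if inode < 1:
        raise ValueError("inode numbers start at 1")
    # Inode numbers are 1-based, hence the subtraction before dividing.
    group, index = divmod(inode - 1, inodes_per_group)
    return group, index, index * inode_size
```

For example, with 2048 inodes per group, inode 2 (the root directory) is entry 1 of group 0, and inode 2049 is entry 0 of group 1.
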
 
Inodes 1 to 10 are typically reserved and should be in an allocated state. The superblock has
the value of the first non-reserved inode. Of the reserved inodes, only number 2 has a specific
function, and it is used for the root directory. Inode 1 keeps track of bad blocks, but it does
not have any special status in the Linux kernel.
 
Each inode has a static number of fields, and additional information might be stored in
extended attributes and indirect block pointers. The allocation status of an inode is determined using the inode bitmap, whose location is
given in the group descriptor.
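A bitmap lookup might look like the sketch below. The article does not state the bit order, so this example assumes the usual one, in which the least significant bit of the first byte stands for the first inode of the group. The function name is illustrative.

```python
def is_allocated(bitmap, index):
    """Test bit `index` of an inode (or block) bitmap.

    index is the position within the group, counted from 0; for an inode
    this is (inode - 1) % inodes_per_group.
    Assumes the least significant bit of byte 0 is index 0.
    """
    return bool(bitmap[index // 8] & (1 << (index % 8)))
```
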
 
Ext2, like UFS, was designed for efficiency of small files. Therefore, each inode can store
the addresses of the first 12 blocks that a file has allocated. These are called direct pointers. If
a file needs more than 12 blocks, a block is allocated to store the remaining addresses. The
pointer to the block is called an indirect block pointer. The addresses in the block are all four
bytes, and the total number in each block is based on the block size. The indirect block
pointer is stored in the inode.
 
If a file has more blocks than can fit in the 12 direct pointers and the indirect block, a double indirect block is used. A double indirect block is when the inode points to a block that
contains a list of single indirect block pointers, each of which points to a block that contains a
list of direct pointers. Lastly, if a file needs still more space, it can use a triple indirect block
pointer. A triple indirect block contains addresses of double indirect blocks, which contain
addresses of single indirect blocks. Each inode contains 12 direct pointers, one single
indirect pointer, one double indirect block pointer, and one triple indirect pointer.
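To make the pointer levels concrete, the following sketch works out which level serves the n-th block of a file and which slot is used at each level. The function name and the return convention are invented for this example; the only facts used are the 12 direct pointers and the four-byte addresses described above.

```python
DIRECT_POINTERS = 12

def classify_block(n, block_size):
    """Return (level, slots) for the n-th block of a file, counting from 0.

    level 0 = direct, 1 = single, 2 = double, 3 = triple indirect.
    slots lists the index to follow at each level, outermost first.
    """
    per_block = block_size // 4          # four-byte addresses per block
    if n < DIRECT_POINTERS:
        return 0, (n,)
    n -= DIRECT_POINTERS
    if n < per_block:
        return 1, (n,)
    n -= per_block
    if n < per_block ** 2:
        return 2, divmod(n, per_block)
    n -= per_block ** 2
    if n < per_block ** 3:
        outer, rest = divmod(n, per_block ** 2)
        return 3, (outer, *divmod(rest, per_block))
    raise ValueError("block index is beyond the triple indirect range")
```

With 1 KiB blocks each indirect block holds 256 addresses, so blocks 0 to 11 are direct, blocks 12 to 267 go through the single indirect block, and the double indirect block covers the next 65,536 blocks.
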
 
An inode also contains the file's size, ownership, and temporal information. The size value in
newer versions of ExtX is 64 bits, but older versions had only 32 bits and therefore could not
handle files over 4GB. Newer versions utilize an unused field for the upper 32 bits of the size
value and set a read-only compatible feature flag when a large file exists.
 
"Ownership" information is stored using the user and group ID.
 
=== Directories ===
Directories are files which contain the information needed to find files within the filesystem. The root directory is inode 2.
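This article does not give the on-disk layout of a directory entry. The sketch below therefore assumes the commonly documented layout: a 4-byte inode number, a 2-byte record length, a 1-byte name length, one further byte (file type or upper half of the name length), then the name. Treat the layout and the function name as assumptions of this example.

```python
import struct

def iter_dir_entries(block):
    """Yield (inode, name) for every used entry in one directory block."""
    offset = 0
    while offset + 8 <= len(block):
        inode, rec_len, name_len = struct.unpack_from("<IHB", block, offset)
        if rec_len < 8:
            break                      # corrupt record; stop rather than loop forever
        if inode != 0:                 # inode 0 marks an unused entry
            name = block[offset + 8:offset + 8 + name_len]
            yield inode, name.decode("utf-8", "replace")
        offset += rec_len              # rec_len leads to the next entry
```
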
 
== Implementation ==
=== Reading a file ===
 
To read a file, the following steps are needed:
 
# Read the superblock to find the size of each block, the number of blocks per group, the number of inodes per group, and the starting block of the first group.
# Read the first entry of the Group Descriptor Table to find the location of the inode table, and get inode 2, which is the root directory.
# Read the block pointers in the inode to find where the directory information is located.
# Iterate through the directory information to find the next directory or file in the path.
# Read the inode bitmap to find out whether the inode pointed to by the directory entry is allocated.
# If the inode is allocated and is a directory, go to step 3. If it is not allocated, continue with step 4.
# Read the block pointers in the file's inode to find where the file data is located.
# Read the first 12 blocks. If the file is smaller than 12 blocks, read only the number of blocks it needs: divide the file size given in the inode by the block size, rounding up.
# If the file is larger than 12 blocks:
## Read the indirect block.
## For each block after the 12th, read its pointer from the indirect block (for example, block 13 uses the first pointer in the indirect block).
## If the file needs more blocks, read the double indirect pointer, which points to a block of pointers to further indirect blocks.
## If the file still needs more blocks, read the triple indirect pointer.
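The last steps, collecting the data blocks of a file, can be sketched as a small recursive walk. This is an illustration under stated assumptions, not a complete reader: <code>read_block</code> is a hypothetical callable that returns exactly one block of bytes for a block address, the file is assumed to have no holes, and addresses are assumed to be little-endian.

```python
import struct

def file_blocks(pointers, size, block_size, read_block):
    """Yield the addresses of a file's data blocks in file order.

    pointers   -- the 15 block pointers from the inode (12 direct + 3 indirect)
    size       -- file size in bytes, from the inode
    read_block -- hypothetical helper: address -> bytes of length block_size
    """
    remaining = -(-size // block_size)     # blocks needed, rounded up

    def walk(address, depth):
        nonlocal remaining
        if remaining == 0:
            return                         # file already fully covered
        if depth == 0:
            remaining -= 1
            yield address                  # a data block
            return
        # An indirect block is a packed array of four-byte addresses.
        table = struct.unpack("<%dI" % (block_size // 4), read_block(address))
        for child in table:
            if remaining == 0:
                break
            yield from walk(child, depth - 1)

    for address in pointers[:12]:          # direct pointers
        yield from walk(address, 0)
    for depth, address in enumerate(pointers[12:15], start=1):
        yield from walk(address, depth)    # single, double, triple indirect
```

Because the walk stops as soon as enough blocks have been produced, unused pointers are never followed.
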
 
== Data structures ==
 
=== Superblock ===
{| class="wikitable"
|-
! Byte Range
! Description
|-
| 0–3 || Number of inodes in file system
|-
| 4–7 || Number of blocks in file system
|-
| 8–11 || Number of blocks reserved to prevent file system from filling up
|-
| 12–15 || Number of unallocated blocks
|-
| 16–19 || Number of unallocated inodes
|-
| 20–23 || Block where block group 0 starts
|-
| 24–27 || Block size (saved as the number of places to shift 1,024 to the left)
|-
| 28–31 || Fragment size (saved as the number of bits to shift 1,024 to the left)
|-
| 32–35 || Number of blocks in each block group
|-
| 36–39 || Number of fragments in each block group
|-
| 40–43 || Number of inodes in each block group
|-
| 44–47 || Last mount time
|-
| 48–51 || Last written time
|-
| 52–53 || Current mount count
|-
| 54–55 || Maximum mount count
|-
| 56–57 || Signature (0xef53)
|-
| 58–59 || File system state (see below)
|-
| 60–61 || Error handling method (see below)
|-
| 62–63 || Minor version
|-
| 64–67 || Last consistency check time
|-
| 68–71 || Interval between forced consistency checks
|-
| 72–75 || Creator OS (see below)
|-
| 76–79 || Major version (see below)
|-
| 80–81 || UID that can use reserved blocks
|-
| 82–83 || GID that can use reserved blocks
|-
| 84–87 || First non-reserved inode in file system
|-
| 88–89 || Size of each inode structure
|-
| 90–91 || Block group that this superblock is part of (if backup copy)
|-
| 92–95 || Compatible feature flags (see below)
|-
| 96–99 || Incompatible feature flags (see Table below)
|-
| 100–103 || Read only feature flags (see Table below)
|-
| 104–119 || File system ID
|-
| 120–135 || Volume name
|-
| 136–199 || Path where last mounted on
|-
| 200–203 || Algorithm usage bitmap
|-
| 204–204 || Number of blocks to preallocate for files
|-
| 205–205 || Number of blocks to preallocate for directories
|-
| 206–207 || Unused
|-
| 208–223 || Journal ID
|-
| 224–227 || Journal inode
|-
| 228–231 || Journal device
|-
| 232–235 || Head of orphan inode list
|-
| 236–1023 || Unused
|}
 
==== System State Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || File system is clean
|-
| 0x0002 || File system has errors
|-
| 0x0004 || Orphan inodes are being recovered
|-
|}
==== Error-Handling Flags ====
{|
|-
! Value
! Description
|-
| 1 || Continue
|-
| 2 || Remount file system as read only
|-
| 3 || Panic
|}
 
==== Creator OS Flags ====
{|
|-
! Value
! Description
|-
| 0 || Linux
|-
| 1 || GNU Hurd
|-
| 2 || Masix
|-
| 3 || FreeBSD
|-
| 4 || Lites
|}
 
==== Compatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Preallocate directory blocks to reduce fragmentation
|-
| 0x0002 || AFS server inodes exist
|-
| 0x0004 || File system has a journal (Ext3)
|-
| 0x0008 || Inodes have extended attributes
|-
| 0x0010 || File system can resize itself for larger partitions
|-
| 0x0020 || Directories use hash index
|}
 
==== Incompatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Compression
|-
| 0x0002 || Directory entries contain a file type field
|-
| 0x0004 || File system needs recovery
|-
| 0x0008 || File system uses a journal device
|}
 
==== Read Only Compatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Sparse superblocks and group descriptor tables
|-
| 0x0002 || File system contains a large file
|-
| 0x0004 || Directories use B-Trees
|}
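The tables above list the flag values but not how a driver is expected to react to them. The sketch below assumes the conventional reading of the three groups: an unknown incompatible flag means the file system must not be mounted, an unknown read-only compatible flag means it may only be mounted read-only, and unknown compatible flags can be ignored. The set of supported flags belongs to a hypothetical driver and is invented for this example.

```python
# Features understood by a hypothetical minimal driver (assumption).
SUPPORTED_INCOMPAT = 0x0002            # directory entries with a file type field
SUPPORTED_RO_COMPAT = 0x0001 | 0x0002  # sparse superblocks, large files

def mount_mode(incompat, ro_compat):
    """Decide how to mount, given superblock bytes 96-99 and 100-103."""
    if incompat & ~SUPPORTED_INCOMPAT:
        return "refuse"        # an unknown incompatible feature is present
    if ro_compat & ~SUPPORTED_RO_COMPAT:
        return "read-only"     # writing could damage an unknown structure
    return "read-write"
```
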
 
=== Group Descriptor Table ===
 
Each entry contains the following
 
{|
|-
! Byte Range
! Description
|-
| 0–3 || Starting block address of block bitmap
|-
| 4–7 || Starting block address of inode bitmap
|-
| 8–11 || Starting block address of inode table
|-
| 12–13 || Number of unallocated blocks in group
|-
| 14–15 || Number of unallocated inodes in group
|-
| 16–17 || Number of directories in group
|-
| 18–31 || Unused
|}
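A descriptor entry with this layout could be decoded as sketched below. The tuple and field names are invented for the example, and little-endian byte order is assumed.

```python
import struct
from collections import namedtuple

DESCRIPTOR_SIZE = 32  # bytes per entry, including the unused tail

GroupDescriptor = namedtuple(
    "GroupDescriptor",
    "block_bitmap inode_bitmap inode_table free_blocks free_inodes directories",
)

def parse_group_descriptor(table, group):
    """Decode the entry for `group` from the raw group descriptor table."""
    # Three 4-byte block addresses followed by three 2-byte counters.
    fields = struct.unpack_from("<IIIHHH", table, group * DESCRIPTOR_SIZE)
    return GroupDescriptor(*fields)
```
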
 
=== Inode ===
 
{|
|-
! Byte Range
! Description
|-
| 0–1 || File mode (type and permissions)
|-
| 2–3 || Lower 16 bits of user ID
|-
| 4–7 || Lower 32 bits of size in bytes
|-
| 8–11 || Access Time
|-
| 12–15 || Change Time
|-
| 16–19 || Modification time
|-
| 20–23 || Deletion time
|-
| 24–25 || Lower 16 bits of group ID
|-
| 26–27 || Link count
|-
| 28–31 || Sector count
|-
| 32–35 || Flags
|-
| 36–39 || Unused
|-
| 40–87 || 12 direct block pointers
|-
| 88–91 || 1 single indirect block pointer
|-
| 92–95 || 1 double indirect block pointer
|-
| 96–99 || 1 triple indirect block pointer
|-
| 100–103 || Generation number (NFS)
|-
| 104–107 || Extended attribute block (File ACL)
|-
| 108–111 || Upper 32 bits of size / Directory ACL
|-
| 112–115 || Block address of fragment
|-
| 116–116 || Fragment index in block
|-
| 117–117 || Fragment size
|-
| 118–119 || Unused
|-
| 120–121 || Upper 16 bits of user ID
|-
| 122–123 || Upper 16 bits of group ID
|-
| 124–127 || Unused
|}
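The following sketch decodes the fields of this table that are needed to read a file. The function name and dictionary keys are illustrative and little-endian byte order is assumed. Bytes 108–111 are treated as the upper half of the size only for regular files, since the same field holds the directory ACL otherwise.

```python
import struct

def parse_inode(raw):
    """Decode selected fields of a 128-byte inode (illustrative names)."""
    mode, uid_low, size_low = struct.unpack_from("<HHI", raw, 0)
    gid_low, links = struct.unpack_from("<HH", raw, 24)
    pointers = struct.unpack_from("<15I", raw, 40)   # 12 direct + 3 indirect
    (size_high,) = struct.unpack_from("<I", raw, 108)
    uid_high, gid_high = struct.unpack_from("<HH", raw, 120)
    is_regular = (mode & 0xF000) == 0x8000
    return {
        "mode": mode,
        "uid": uid_low | (uid_high << 16),
        "gid": gid_low | (gid_high << 16),
        "links": links,
        # Only regular files use bytes 108-111 as the upper 32 bits of size.
        "size": (size_low | (size_high << 32)) if is_regular else size_low,
        "pointers": pointers,
    }
```
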
 
==== Filemode flags ====
{| class="wikitable"
|+ Bits 0-8
|-
! Permission Flag
! Description
|-
| 0x001 || Other—execute permission
|-
| 0x002 || Other—write permission
|-
| 0x004 || Other—read permission
|-
| 0x008 || Group—execute permission
|-
| 0x010 || Group—write permission
|-
| 0x020 || Group—read permission
|-
| 0x040 || User—execute permission
|-
| 0x080 || User—write permission
|-
| 0x100 || User—read permission
|}
 
{|
|+ Bits 9-11
|-
! Flag Value
! Description
|-
| 0x200 || Sticky bit
|-
| 0x400 || Set group ID
|-
| 0x800 || Set user ID
|}
 
{|
|+ Bits 12-15
|-
! Type Value
! Description
|-
| 0x1000 || FIFO
|-
| 0x2000 || Character device
|-
| 0x4000 || Directory
|-
| 0x6000 || Block device
|-
| 0x8000 || Regular file
|-
| 0xA000 || Symbolic link
|-
| 0xC000 || Unix socket
|}
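As a small illustration, the mode tables above can be combined into a decoder that returns the file type and an <code>ls</code>-style permission string. The function name is made up for this example, and the sticky, set group ID and set user ID bits are ignored to keep it short.

```python
FILE_TYPES = {
    0x1000: "FIFO",
    0x2000: "character device",
    0x4000: "directory",
    0x6000: "block device",
    0x8000: "regular file",
    0xA000: "symbolic link",
    0xC000: "Unix socket",
}

def describe_mode(mode):
    """Return (file type, permission string) for a 16-bit file mode."""
    # Bits 12-15 hold a type value, not independent flags.
    file_type = FILE_TYPES.get(mode & 0xF000, "unknown")
    # Walk from user read (0x100) down to other execute (0x001).
    permissions = "".join(
        letter if mode & (0x100 >> bit) else "-"
        for bit, letter in enumerate("rwxrwxrwx")
    )
    return file_type, permissions
```
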
 
==== Inode flags ====
{|
|-
! Flag Value
! Description
|-
| 0x00000001 || Secure deletion (not used)
|-
| 0x00000002 || Keep a copy of data when deleted (not used)
|-
| 0x00000004 || File compression (not used)
|-
| 0x00000008 || Synchronous updates—new data is written immediately to disk
|-
| 0x00000010 || Immutable file (content cannot be changed)
|-
| 0x00000020 || Append only
|-
| 0x00000040 || File is not included in 'dump' command
|-
| 0x00000080 || A-time is not updated
|-
| 0x00001000 || Hash indexed directory
|-
| 0x00002000 || File data is journaled with Ext3
|}
 
== Links ==