Ext2: Difference between revisions

From OSDev.wiki
Ext2/ext3 is used by many distributions of Linux. Ext2/3 is based on the Unix File System (UFS); ExtX has removed the UFS components that are no longer needed.

The Ext2 file system is split up into blocks, and the blocks are grouped together into block groups. Each block group contains a backup copy of the superblock and of the group descriptor table.

The superblock, which contains important information about the layout of the file system, is located at byte 1024 and is 1024 bytes in length.

== About ext2fs (Second Extended Filesystem) ==

The Second Extended Filesystem (ext2fs) was the default filesystem of Linux prior to the advent of the journaling file systems ext3fs and ReiserFS. It has native support for UNIX ownership and access rights, symbolic and hard links, and other Unix-native properties. Like HPFS, it tries to minimize head movement by distributing data across the disk. By using "groups", it also minimizes the impact of fragmentation. It is another "inode"-based system. An ext2fs partition is made up of blocks, which normally are 1K each. The first block (the boot block) is reserved for boot code; all the other blocks are divided into so-called block groups (normally, between 256 and 8192 blocks form a group). Each block group contains:

* a copy of the superblock (a structure describing the layout of the filesystem);
* the group descriptors (one entry per block group, recording where that group's bitmaps and inode table are located);
* the block bitmap, which tells which blocks are used;
* the inode bitmap, which tells which inodes are used (one bit per entry of the inode table, rather than one bit per block);
* the inode table, which contains the inodes themselves;
* the data blocks referenced by the inodes.

The first inode is a special one; it is the bad blocks inode, which references all the damaged blocks of the partition. The fifth inode is reserved for a boot loader, and the second inode contains the root directory.

Windows users can access ext2fs partitions with [http://www.chrysocome.net/explore2fs explore2fs].

== About ext3fs (Third Extended File System) ==

ext3fs is basically ext2fs with journaling added. If an ext3fs partition does not need journal replay, it can even be accessed with a 'simple' ext2fs driver.

== File System Structure ==

The Ext2 file system is split up into blocks, whose size is defined in the superblock. The blocks are grouped together into block groups. Each block group contains a backup copy of the superblock and of the group descriptor table. Each block group also contains a block bitmap (a bitmap of allocated blocks within the group), an inode bitmap (a bitmap of allocated inodes within the group), and an inode table.

=== Superblock ===

The superblock, which contains important information about the layout of the file system, is located at byte 1024 and is 1024 bytes in length. From the superblock we can learn the size of each block (bytes 24–27), the number of inodes (bytes 0–3), the number of blocks (bytes 4–7), the number of blocks per group (bytes 32–35), the number of inodes in each group (bytes 40–43), and where the first block group is located (bytes 20–23). From this information we can find the group an inode belongs to and the total number of groups in the filesystem.
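
As a rough illustration of how the superblock fields mentioned above are used, the following Python sketch reads them from a raw 1024-byte superblock and derives the block size and the number of block groups. The function name <code>parse_superblock</code> and the dictionary keys are made up for this example; the offsets come from the superblock table in the Data structures section, and all fields are assumed to be little-endian.

```python
import struct

def parse_superblock(sb: bytes) -> dict:
    """Extract a few layout fields from a raw 1024-byte ext2 superblock."""
    if len(sb) < 1024:
        raise ValueError("the superblock is 1024 bytes long")
    inodes_count, blocks_count = struct.unpack_from("<II", sb, 0)
    first_data_block, log_block_size = struct.unpack_from("<II", sb, 20)
    blocks_per_group, = struct.unpack_from("<I", sb, 32)
    inodes_per_group, = struct.unpack_from("<I", sb, 40)
    magic, = struct.unpack_from("<H", sb, 56)
    if magic != 0xEF53:
        raise ValueError("signature mismatch: not an ext2 superblock")
    # The block size is stored as a shift count applied to 1024.
    block_size = 1024 << log_block_size
    # Round up, because the last group may hold fewer blocks than the others.
    data_blocks = blocks_count - first_data_block
    group_count = (data_blocks + blocks_per_group - 1) // blocks_per_group
    return {
        "inodes_count": inodes_count,
        "blocks_count": blocks_count,
        "first_data_block": first_data_block,
        "block_size": block_size,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "group_count": group_count,
    }
```

A caller would typically seek to byte 1024 of the volume, read 1024 bytes, and pass them to this function before touching any other structure.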

=== Group Descriptor Table ===

The Group Descriptor Table contains an entry for each block group within the filesystem. The table is located in the block following the superblock. Each descriptor records where the important data structures for that group are located.
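
Which block counts as "the block after the superblock" depends on the block size. With 1 KiB blocks the superblock occupies block 1, so the table starts at block 2; with larger blocks the superblock lies inside block 0, so the table starts at block 1. On file systems written by the usual tools, both cases equal the "block where block group 0 starts" field (superblock bytes 20–23) plus one. The sketch below relies on that assumption; <code>group_desc_offset</code> is a hypothetical helper name.

```python
GROUP_DESC_SIZE = 32  # bytes per entry, per the group descriptor layout

def group_desc_offset(group: int, block_size: int, first_data_block: int) -> int:
    """Byte offset, from the start of the volume, of one group's descriptor.

    Assumes the table starts in the block right after the one that holds
    the superblock, i.e. at block first_data_block + 1.
    """
    if group < 0:
        raise ValueError("group numbers start at 0")
    table_block = first_data_block + 1
    return table_block * block_size + group * GROUP_DESC_SIZE
```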

=== Inodes ===

One inode is allocated to every file and directory, and each inode has an address (its inode number). From an inode address, we can determine which group the inode is in by using the following formula.

 group = (inode - 1) / INODES_PER_GROUP
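
The division in the formula is an integer division, and its remainder gives the position of the inode inside that group's inode table. A minimal sketch, with <code>locate_inode</code> being a name chosen for this example:

```python
def locate_inode(inode: int, inodes_per_group: int):
    """Return (group, index) for an inode number.

    Inode numbers start at 1, hence the subtraction. The index is the
    zero-based position of the inode within its group's inode table.
    """
    if inode < 1:
        raise ValueError("inode numbers start at 1")
    return divmod(inode - 1, inodes_per_group)
```

For instance, with 2048 inodes per group, inode 2 (the root directory) is entry 1 of group 0, and inode 2049 is entry 0 of group 1.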

Inodes 1 to 10 are typically reserved and should be in an allocated state. The superblock has
the value of the first non-reserved inode. Of the reserved inodes, only number 2 has a specific
function, and it is used for the root directory. Inode 1 keeps track of bad blocks, but it does
not have any special status in the Linux kernel.

Each inode has a static number of fields, and additional information might be stored in
extended attributes and indirect block pointers. The allocation status of an inode is determined using the inode bitmap, whose location is
given in the group descriptor.
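
The bitmap lookup itself is a single bit test. The sketch below assumes the bitmap block has already been read into memory and that bits are numbered from the least significant bit of each byte, which is how ext2 orders its bitmaps; <code>inode_is_allocated</code> is a hypothetical helper.

```python
def inode_is_allocated(bitmap: bytes, index: int) -> bool:
    """Test one bit of a group's inode bitmap.

    index is the zero-based position of the inode within its group,
    i.e. (inode - 1) % INODES_PER_GROUP.
    """
    return bool(bitmap[index // 8] & (1 << (index % 8)))
```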

Ext2, like UFS, was designed for efficiency of small files. Therefore, each inode can store
the addresses of the first 12 blocks that a file has allocated. These are called direct pointers. If
a file needs more than 12 blocks, a block is allocated to store the remaining addresses. The
pointer to the block is called an indirect block pointer. The addresses in the block are all four
bytes, and the total number in each block is based on the block size. The indirect block
pointer is stored in the inode.

If a file has more blocks than can fit in the 12 direct pointers and the indirect block, a double indirect block is used. A double indirect block is when the inode points to a block that
contains a list of single indirect block pointers, each of which points to a block that contains a
list of direct pointers. Lastly, if a file needs still more space, it can use a triple indirect block
pointer. A triple indirect block contains addresses of double indirect blocks, which contain
addresses of single indirect blocks. Each inode contains 12 direct pointers, one single
indirect pointer, one double indirect block pointer, and one triple indirect pointer.
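
To make the pointer hierarchy concrete, the following sketch works out which kind of pointer covers a given block of a file, and which slot to follow at each level. It only does the arithmetic and reads nothing from disk; <code>classify_block</code> and the level names are invented for this example.

```python
def classify_block(n: int, block_size: int):
    """Map a file-relative block number to (level, slot indices).

    Each indirect block holds block_size // 4 four-byte addresses.
    """
    ptrs = block_size // 4
    if n < 0:
        raise ValueError("block numbers start at 0")
    if n < 12:
        return ("direct", (n,))
    n -= 12
    if n < ptrs:
        return ("single", (n,))
    n -= ptrs
    if n < ptrs * ptrs:
        # (slot in the double indirect block, slot in the single indirect block)
        return ("double", divmod(n, ptrs))
    n -= ptrs * ptrs
    if n < ptrs ** 3:
        first, rest = divmod(n, ptrs * ptrs)
        return ("triple", (first,) + divmod(rest, ptrs))
    raise ValueError("block number is beyond the triple indirect range")
```

With 1 KiB blocks each indirect block holds 256 addresses, so blocks 0 to 11 are direct, blocks 12 to 267 go through the single indirect block, and block 268 is the first one reached through the double indirect block.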

An inode also contains the file's size, ownership, and temporal information. The size value in
newer versions of ExtX is 64 bits, but older versions had only 32 bits and therefore could not
handle files over 4GB. Newer versions utilize an unused field for the upper 32 bits of the size
value and set a read-only compatible feature flag when a large file exists.
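
A small sketch of how the two halves of the size might be combined, using the inode offsets from the Data structures section (bytes 4–7 and 108–111). Whether the upper half may be trusted is left to the caller through the <code>large_file</code> argument, which is an assumption of this example; note that for directories bytes 108–111 hold the directory ACL instead.

```python
import struct

def inode_size(inode_raw: bytes, large_file: bool) -> int:
    """Return the file size stored in a raw inode.

    large_file should be True only for regular files on a file system
    that supports the 64-bit size field.
    """
    low, = struct.unpack_from("<I", inode_raw, 4)
    if not large_file:
        return low
    high, = struct.unpack_from("<I", inode_raw, 108)
    return (high << 32) | low
```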

"Ownership" information is stored using the user and group ID.

=== Directories ===
Directories are files which contain the information needed to find files within the filesystem. The root directory is inode 2.

== Implementation ==
=== Reading a file ===

To read a file, the following steps are needed:

# Read the superblock to find the size of each block, the number of blocks per group, the number of inodes per group, and the starting block of the first group.
# Read the first entry of the Group Descriptor Table to find the location of the inode table, and read inode 2 from it; this is the root directory.
# Read the block pointers in the inode to find where the directory information is located.
# Iterate through the directory information to find the next directory or file of the path.
# Read the inode bitmap to find out whether the inode pointed to by the directory entry is allocated.
# If it is allocated and is a directory, go to step 3. If it is not allocated, continue with step 4. If it is allocated and is the file being looked for, go on to step 7.
# Read the block pointers in the file's inode to find where the file data is located.
# Read the first 12 blocks through the direct pointers. If the file is smaller than 12 blocks, read only the number of blocks the file needs (to get the number of blocks the file takes, divide the file size stored in the inode by the block size, rounding up).
# If the file is larger than 12 blocks:
## Read the indirect block.
## For each block beyond the first 12, read the corresponding pointer from the indirect block (e.g. when reading block 13, read pointer 1 from the indirect block).
## If the file needs more blocks, read the double indirect pointer; it points to a block containing pointers to indirect blocks, which in turn contain pointers to blocks of data.
## If the file still needs more blocks, read the triple indirect pointer.
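
Steps 7 and 8 above can be sketched as follows for a file small enough to fit in the direct pointers. The function name <code>read_direct_blocks</code> is made up for this example, <code>image</code> is assumed to be a seekable binary file object positioned on the start of the volume, and a pointer value of 0 is treated as a hole that reads back as zeros. Indirect blocks are deliberately left out.

```python
import struct

def read_direct_blocks(image, inode_raw: bytes, block_size: int) -> bytes:
    """Read the data reachable through the 12 direct pointers of an inode."""
    size, = struct.unpack_from("<I", inode_raw, 4)        # lower 32 bits of size
    pointers = struct.unpack_from("<12I", inode_raw, 40)  # direct block pointers
    # Number of blocks the file occupies, rounded up and capped at 12.
    needed = min(12, (size + block_size - 1) // block_size)
    data = bytearray()
    for block in pointers[:needed]:
        if block == 0:
            data += bytes(block_size)   # unallocated block inside the file
        else:
            image.seek(block * block_size)
            data += image.read(block_size)
    # Drop the padding after the end of the file in the last block.
    return bytes(data[:size])
```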

== Data structures ==

=== Superblock ===
{| class="wikitable"
|-
! Byte Range
! Description
|-
| 0–3 || Number of inodes in file system
|-
| 4–7 || Number of blocks in file system
|-
| 8–11 || Number of blocks reserved to prevent file system from filling up
|-
| 12–15 || Number of unallocated blocks
|-
| 16–19 || Number of unallocated inodes
|-
| 20–23 || Block where block group 0 starts
|-
| 24–27 || Block size (saved as the number of places to shift 1,024 to the left)
|-
| 28–31 || Fragment size (saved as the number of bits to shift 1,024 to the left)
|-
| 32–35 || Number of blocks in each block group
|-
| 36–39 || Number of fragments in each block group
|-
| 40–43 || Number of inodes in each block group
|-
| 44–47 || Last mount time
|-
| 48–51 || Last written time
|-
| 52–53 || Current mount count
|-
| 54–55 || Maximum mount count
|-
| 56–57 || Signature (0xef53)
|-
| 58–59 || File system state (see below)
|-
| 60–61 || Error handling method (see below)
|-
| 62–63 || Minor version
|-
| 64–67 || Last consistency check time
|-
| 68–71 || Interval between forced consistency checks
|-
| 72–75 || Creator OS (see below)
|-
| 76–79 || Major version
|-
| 80–81 || UID that can use reserved blocks
|-
| 82–83 || GID that can use reserved blocks
|-
| 84–87 || First non-reserved inode in file system
|-
| 88–89 || Size of each inode structure
|-
| 90–91 || Block group that this superblock is part of (if backup copy)
|-
| 92–95 || Compatible feature flags (see below)
|-
| 96–99 || Incompatible feature flags (see Table below)
|-
| 100–103 || Read only feature flags (see Table below)
|-
| 104–119 || File system ID
|-
| 120–135 || Volume name
|-
| 136–199 || Path where last mounted on
|-
| 200–203 || Algorithm usage bitmap
|-
| 204–204 || Number of blocks to preallocate for files
|-
| 205–205 || Number of blocks to preallocate for directories
|-
| 206–207 || Unused
|-
| 208–223 || Journal ID
|-
| 224–227 || Journal inode
|-
| 228–231 || Journal device
|-
| 232–235 || Head of orphan inode list
|-
| 236–1023 || Unused
|}

==== System State Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || File system is clean
|-
| 0x0002 || File system has errors
|-
| 0x0004 || Orphan inodes are being recovered
|-
|}
==== Error-Handling Flags ====
{|
|-
! Value
! Description
|-
| 1 || Continue
|-
| 2 || Remount file system as read only
|-
| 3 || Panic
|}

==== Creator OS Flags ====
{|
|-
! Value
! Description
|-
| 0 || Linux
|-
| 1 || GNU Hurd
|-
| 2 || Masix
|-
| 3 || FreeBSD
|-
| 4 || Lites
|}

==== Compatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Preallocate directory blocks to reduce fragmentation
|-
| 0x0002 || AFS server inodes exist
|-
| 0x0004 || File system has a journal (Ext3)
|-
| 0x0008 || Inodes have extended attributes
|-
| 0x0010 || File system can resize itself for larger partitions
|-
| 0x0020 || Directories use hash index
|}

==== Incompatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Compression
|-
| 0x0002 || Directory entries contain a file type field
|-
| 0x0004 || File system needs recovery
|-
| 0x0008 || File system uses a journal device
|}

==== Read Only Compatible Features Flags ====
{|
|-
! Flag Value
! Description
|-
| 0x0001 || Sparse superblocks and group descriptor tables
|-
| 0x0002 || File system contains a large file
|-
| 0x0004 || Directories use B-Trees
|}
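
The three groups of feature flags differ in what a driver should do when it meets a flag it does not implement: unknown compatible flags can be ignored, unknown read-only compatible flags still allow a read-only mount, and unknown incompatible flags mean the file system should not be mounted at all. A hedged sketch of that decision follows; the function name, the return strings, and the two default "supported" masks are choices made for this example, and the fields are only meaningful when the major version is 1 or higher.

```python
import struct

SUPPORTED_INCOMPAT = 0x0002            # directory entries with a file type field
SUPPORTED_RO_COMPAT = 0x0001 | 0x0002  # sparse superblocks, large files

def mount_mode(sb: bytes,
               incompat_ok: int = SUPPORTED_INCOMPAT,
               ro_compat_ok: int = SUPPORTED_RO_COMPAT) -> str:
    """Decide how a raw superblock may be mounted by this (imaginary) driver."""
    incompat, ro_compat = struct.unpack_from("<II", sb, 96)
    if incompat & ~incompat_ok:
        return "refuse"
    if ro_compat & ~ro_compat_ok:
        return "read-only"
    return "read-write"
```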

=== Group Descriptor Table ===

Each entry contains the following

{|
|-
! Byte Range
! Description
|-
| 0–3 || Starting block address of block bitmap
|-
| 4–7 || Starting block address of inode bitmap
|-
| 8–11 || Starting block address of inode table
|-
| 12–13 || Number of unallocated blocks in group
|-
| 14–15 || Number of unallocated inodes in group
|-
| 16–17 || Number of directories in group
|-
| 18–31 || Unused
|}
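
The entry layout above maps directly onto a fixed-size record. The sketch below decodes one 32-byte entry and then uses the inode table address to find the byte offset of a particular inode. Both function names are hypothetical, fields are assumed to be little-endian, and the default inode size of 128 bytes should be replaced by the value from superblock bytes 88–89 where that field is valid.

```python
import struct

def parse_group_desc(raw: bytes) -> dict:
    """Decode the used part (bytes 0-17) of one group descriptor entry."""
    (block_bitmap, inode_bitmap, inode_table,
     free_blocks, free_inodes, dirs) = struct.unpack_from("<IIIHHH", raw, 0)
    return {
        "block_bitmap": block_bitmap,
        "inode_bitmap": inode_bitmap,
        "inode_table": inode_table,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "directories": dirs,
    }

def inode_offset(inode_table: int, index: int, block_size: int,
                 inode_size: int = 128) -> int:
    """Byte offset of the index-th inode of a group's inode table."""
    return inode_table * block_size + index * inode_size
```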

=== Inode ===

{|
|-
! Byte Range
! Description
|-
| 0–1 || File mode (type and permissions)
|-
| 2–3 || Lower 16 bits of user ID
|-
| 4–7 || Lower 32 bits of size in bytes
|-
| 8–11 || Access Time
|-
| 12–15 || Change Time
|-
| 16–19 || Modification time
|-
| 20–23 || Deletion time
|-
| 24–25 || Lower 16 bits of group ID
|-
| 26–27 || Link count
|-
| 28–31 || Sector count
|-
| 32–35 || Flags
|-
| 36–39 || Unused
|-
| 40–87 || 12 direct block pointers
|-
| 88–91 || 1 single indirect block pointer
|-
| 92–95 || 1 double indirect block pointer
|-
| 96–99 || 1 triple indirect block pointer
|-
| 100–103 || Generation number (NFS)
|-
| 104–107 || Extended attribute block (File ACL)
|-
| 108–111 || Upper 32 bits of size (regular files) / Directory ACL (directories)
|-
| 112–115 || Block address of fragment
|-
| 116–116 || Fragment index in block
|-
| 117–117 || Fragment size
|-
| 118–119 || Unused
|-
| 120–121 || Upper 16 bits of user ID
|-
| 122–123 || Upper 16 bits of group ID
|-
| 124–127 || Unused
|}
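
As an illustration of the layout above, the following sketch unpacks the most commonly needed fields of a raw 128-byte inode and joins the split user and group IDs. The function name and dictionary keys are invented for this example, and little-endian fields are assumed.

```python
import struct

def parse_inode(raw: bytes) -> dict:
    """Decode selected fields of a raw 128-byte ext2 inode."""
    (mode, uid_lo, size_lo, atime, ctime, mtime, dtime,
     gid_lo, links, sectors, flags) = struct.unpack_from("<HHIIIIIHHII", raw, 0)
    pointers = struct.unpack_from("<15I", raw, 40)   # bytes 40-99
    uid_hi, gid_hi = struct.unpack_from("<HH", raw, 120)
    return {
        "mode": mode,
        "uid": uid_lo | (uid_hi << 16),
        "gid": gid_lo | (gid_hi << 16),
        "size_lo": size_lo,
        "links": links,
        "sectors": sectors,
        "flags": flags,
        "direct": pointers[:12],
        "single_indirect": pointers[12],
        "double_indirect": pointers[13],
        "triple_indirect": pointers[14],
    }
```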

==== Filemode flags ====
{| class="wikitable"
|+ Bits 0-8
|-
! Permission Flag
! Description
|-
| 0x001 || Other—execute permission
|-
| 0x002 || Other—write permission
|-
| 0x004 || Other—read permission
|-
| 0x008 || Group—execute permission
|-
| 0x010 || Group—write permission
|-
| 0x020 || Group—read permission
|-
| 0x040 || User—execute permission
|-
| 0x080 || User—write permission
|-
| 0x100 || User—read permission
|}

{|
|+ Bits 9-11
|-
! Flag Value
! Description
|-
| 0x200 || Sticky bit
|-
| 0x400 || Set group ID
|-
| 0x800 || Set user ID
|}

Bits 12-15

{|
|-
! Type Value
! Description
|-
| 0x1000 || FIFO
|-
| 0x2000 || Character device
|-
| 0x4000 || Directory
|-
| 0x6000 || Block device
|-
| 0x8000 || Regular file
|-
| 0xA000 || Symbolic link
|-
| 0xC000 || Unix socket
|}
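
The mode field can be taken apart with two masks: the top four bits select the file type and the low nine bits are the permissions. The sketch below turns a mode value into a short description similar to what <code>ls -l</code> prints; the function name is an assumption, and the set-user-ID, set-group-ID, and sticky bits are ignored to keep it short.

```python
FILE_TYPES = {
    0x1000: "FIFO",
    0x2000: "Character device",
    0x4000: "Directory",
    0x6000: "Block device",
    0x8000: "Regular file",
    0xA000: "Symbolic link",
    0xC000: "Unix socket",
}

def describe_mode(mode: int) -> str:
    """Return the file type and an rwx string for an inode mode value."""
    kind = FILE_TYPES.get(mode & 0xF000, "Unknown")
    # Walk from the user-read bit (0x100) down to the other-execute bit (0x001).
    perms = "".join(
        letter if mode & (0x100 >> i) else "-"
        for i, letter in enumerate("rwxrwxrwx")
    )
    return f"{kind} {perms}"
```

For example, a mode of 0x81A4 describes a regular file with permissions rw-r--r--, and 0x41ED a directory with rwxr-xr-x.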

==== Inode flags ====
{|
|-
! Flag Value
! Description
|-
| 0x00000001 || Secure deletion (not used)
|-
| 0x00000002 || Keep a copy of data when deleted (not used)
|-
| 0x00000004 || File compression (not used)
|-
| 0x00000008 || Synchronous updates—new data is written immediately to disk
|-
| 0x00000010 || Immutable file (content cannot be changed)
|-
| 0x00000020 || Append only
|-
| 0x00000040 || File is not included in 'dump' command
|-
| 0x00000080 || A-time is not updated
|-
| 0x00001000 || Hash indexed directory
|-
| 0x00002000 || File data is journaled with Ext3
|}


== Links ==

Revision as of 22:33, 16 December 2007
