{{Filesystems}}
'''ISO 9660''' is the standard file system for CD-ROMs. It is also widely used
on DVD and BD media and may also be present on USB sticks or hard disks.
Its specifications are available for free under the name '''ECMA-119'''.
== Overview and caveats ==
ISO 9660 is not a complex file system, but it has a few quirks that are worth remembering. Some operating systems create non-compliant CDs, so beware! The main example of this is the character set that is available for file names. Strictly, file names may consist only of uppercase letters A-Z, digits, dots, and underscores, plus a semicolon that separates the visible file name from its version number suffix. Many operating systems also allow lowercase letters and other characters. Linux's [[VFS]] displays lowercase file names to the user even though the CD actually contains uppercase characters.
=== Sector size ===
An ISO 9660 sector is normally 2 KiB long. Although the specification allows for alternative sector sizes, you will rarely find anything other than 2 KiB.
=== Numerical formats ===
Another quirk of the system is that it has several numbering formats, and multi-byte numbers are often represented in ''both-endian'' format. The ISO 9660 standard specifies three ways to encode 16-bit and 32-bit integers: little-endian (least-significant byte first), big-endian (most-significant byte first), or both (little-endian followed by big-endian). Both-endian (LSB-MSB) fields are therefore twice as wide, which is why 32-bit LBAs often appear as 8-byte fields. Where a both-endian field is present, code for a little-endian architecture such as x86 can read the first (little-endian) copy and ignore the big-endian copy.
{| {{wikitable}}
! Encoding
! Description
|-
| int8 || Unsigned 8-bit integer.
|-
| sint8 || Signed 8-bit integer.
|-
| int16_LSB || Little-endian encoded unsigned 16-bit integer.
|-
| int16_MSB || Big-endian encoded unsigned 16-bit integer.
|-
| int16_LSB-MSB || Little-endian followed by big-endian encoded unsigned 16-bit integer.
|-
| sint16_LSB || Little-endian encoded signed 16-bit integer.
|-
| sint16_MSB || Big-endian encoded signed 16-bit integer.
|-
| sint16_LSB-MSB || Little-endian followed by big-endian encoded signed 16-bit integer.
|-
| int32_LSB || Little-endian encoded unsigned 32-bit integer.
|-
| int32_MSB || Big-endian encoded unsigned 32-bit integer.
|-
| int32_LSB-MSB || Little-endian followed by big-endian encoded unsigned 32-bit integer.
|-
| sint32_LSB || Little-endian encoded signed 32-bit integer.
|-
| sint32_MSB || Big-endian encoded signed 32-bit integer.
|-
| sint32_LSB-MSB || Little-endian followed by big-endian encoded signed 32-bit integer.
|}
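As an illustration, a both-endian field can be decoded by reading its little-endian half and, optionally, comparing it with the big-endian half. This is only a minimal sketch: the function name <code>read_both_endian</code> and the <code>strict</code> flag are made up for this example and are not part of the standard.

```python
import struct

def read_both_endian(buf, offset, bits, strict=False):
    """Decode an int16_LSB-MSB or int32_LSB-MSB field starting at offset."""
    if bits not in (16, 32):
        raise ValueError("bits must be 16 or 32")
    code = "H" if bits == 16 else "I"
    width = bits // 8
    # The little-endian copy comes first, the big-endian copy follows it.
    little = struct.unpack_from("<" + code, buf, offset)[0]
    big = struct.unpack_from(">" + code, buf, offset + width)[0]
    if strict and little != big:
        raise ValueError("little-endian and big-endian copies disagree")
    return little
```

With <code>strict=False</code> the big-endian copy is ignored, which tolerates discs where only one half was written correctly.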
=== Date/time format ===
The date/time format used in the ''Primary Volume Descriptor'' is denoted as ''dec-datetime'' and uses ASCII digits to represent the main parts of the date/time:
{| {{wikitable}}
!Offset
!Size
!Datatype
!Description
|-
| 0 || 4 || strD || Year from 1 to 9999.
|-
| 4 || 2 || strD || Month from 1 to 12.
|-
| 6 || 2 || strD || Day from 1 to 31.
|-
| 8 || 2 || strD || Hour from 0 to 23.
|-
| 10 || 2 || strD || Minute from 0 to 59.
|-
| 12 || 2 || strD || Second from 0 to 59.
|-
| 14 || 2 || strD || Hundredths of a second from 0 to 99.
|-
| 16 || 1 || sint8 || Time zone offset from GMT in 15 minute intervals, recorded as a signed value from -48 (west) to +52 (east). So value -48 equals GMT-12 hours, value 0 equals GMT, and value 52 equals GMT+13 hours.
|}
All fields except for the offset from GMT are in ASCII digits. When the date and time is not specified, all string fields are ASCII '0' (for a total of 16 ASCII zeroes) and the last field is binary zero.
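The following sketch shows one way to turn such a 17-byte field into a timestamp. It assumes, following ECMA-119, that the final byte is a signed count of 15-minute intervals; the function name <code>parse_dec_datetime</code> is invented for this example.

```python
from datetime import datetime, timedelta, timezone

def parse_dec_datetime(field):
    """Parse a 17-byte dec-datetime field; return None when it is unspecified."""
    if len(field) != 17:
        raise ValueError("dec-datetime fields are 17 bytes long")
    digits = field[:16].decode("ascii")
    if digits == "0" * 16 and field[16] == 0:
        return None  # date and time not specified
    # Assumption: the last byte is a signed count of 15-minute intervals.
    intervals = int.from_bytes(field[16:17], "big", signed=True)
    return datetime(
        int(digits[0:4]), int(digits[4:6]), int(digits[6:8]),
        int(digits[8:10]), int(digits[10:12]), int(digits[12:14]),
        int(digits[14:16]) * 10_000,  # hundredths of a second -> microseconds
        tzinfo=timezone(timedelta(minutes=15 * intervals)),
    )
```

Fields with out-of-range digits raise <code>ValueError</code>, so a caller dealing with sloppy discs may want to catch that and treat the date as unknown.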
=== String format ===
Character strings are encoded with ASCII encoding. The specification does not permit all characters. It defines two sets of characters: 'a-characters' and 'd-characters'. You will see these terms used in the descriptor tables throughout this article. The character sets are:
<pre>
a-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
! " % & ' ( ) * + , - . / : ; < = > ?
</pre>
<pre>
d-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
</pre>
{| {{wikitable}}
! Encoding
! Description
|-
| strA || String with only ASCII a-characters, padded to the right with spaces.
|-
| strD || String with only ASCII d-characters, padded to the right with spaces.
|}
Note that not all CDs strictly adhere to the character sets specified in ISO 9660.
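Since non-compliant strings do occur, a reader may want to strip the padding and check a string against the two character sets rather than trust it. The sketch below is one way to do that; the helper names are made up for this example, and the space character is counted as an a-character, as ECMA-119 defines it.

```python
D_CHARACTERS = frozenset("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_")
# Per ECMA-119 the space character also belongs to the a-characters.
A_CHARACTERS = D_CHARACTERS | frozenset(" !\"%&'()*+,-./:;<=>?")

def decode_padded(field):
    """Decode a strA/strD field and drop the space padding on the right."""
    return field.decode("ascii", errors="replace").rstrip(" ")

def is_strd(text):
    """True if text consists only of d-characters."""
    return all(ch in D_CHARACTERS for ch in text)

def is_stra(text):
    """True if text consists only of a-characters."""
    return all(ch in A_CHARACTERS for ch in text)
```

A tolerant reader would typically log a failed check and carry on, instead of rejecting the volume.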
=== Filenames ===
Filenames must use d-character encoding (strD), plus dot and semicolon
which have to occur exactly once per filename.
Filenames are composed of a File Name, a dot, a File Name Extension,
a semicolon, and a version number in decimal digits. The latter two
are usually not displayed to the user.
There are three Levels of Interchange defined. Level 1 allows filenames
with a File Name length of 8 and an extension length of 3 (like MS-DOS).
Levels 2 and 3 allow File Name and File Name Extension to have a combined
length of up to 30 characters.
The ECMA-119 Directory Record format can hold composed names of up to 222
characters. This would violate the specs but must nevertheless be handled
by a reader of the filesystem.
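A file identifier can be taken apart as sketched below. The function names are invented for this example. Splitting at the last dot is a choice made here so that non-compliant names with several dots still yield something sensible.

```python
def split_file_identifier(identifier):
    """Split 'NAME.EXT;1' into (name, extension, version)."""
    base, _, version = identifier.partition(";")
    if "." in base:
        # Split at the last dot so that stray extra dots stay in the name.
        name, _, extension = base.rpartition(".")
    else:
        name, extension = base, ""
    return name, extension, int(version) if version.isdigit() else None

def fits_level_1(identifier):
    """Check the Level 1 limits: File Name up to 8, extension up to 3 characters."""
    name, extension, _ = split_file_identifier(identifier)
    return len(name) <= 8 and len(extension) <= 3
```

For display purposes, a reader would normally show only the name and extension and hide the version number.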
=== Size Limitations ===
ISO 9660 filesystems can have up to 2<sup>32</sup> blocks, i.e. 8 TiB with the usual block size of 2 KiB.
Normally they will be restricted to the size of optical media.
(Currently up to 100 GiB with 4-layer BD-R.)
The maximum size of data files depends on the Level of Interchange
that is intended for the ISO filesystem. Levels 1 and 2 allow for
4 GiB - 1 bytes, because a single Directory Record can claim up to that
number of bytes. Level 3 allows multiple consecutive
Directory Records with the same name. Their extents are concatenated
to form a single data file. This means that a single data file can nearly
fill up the full 8 TiB of image size.
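The arithmetic behind these limits, and a possible way of dividing a large file among several Level 3 Directory Records, can be sketched as follows. The function name is made up, and the rule that every extent except the last covers a whole number of blocks is an assumption of this sketch, chosen so that the concatenated extents line up with block boundaries.

```python
BLOCK_SIZE = 2048
MAX_BLOCKS = 2 ** 32            # block addresses are 32-bit numbers
MAX_EXTENT_BYTES = 2 ** 32 - 1  # the data length of one Directory Record is 32-bit

def split_into_extents(file_size, block_size=BLOCK_SIZE):
    """Return the byte sizes of the extents needed to hold file_size bytes."""
    # Assumption: every extent except the last is a whole number of blocks.
    full_extent = (MAX_EXTENT_BYTES // block_size) * block_size
    sizes = []
    remaining = file_size
    while remaining > MAX_EXTENT_BYTES:
        sizes.append(full_extent)
        remaining -= full_extent
    sizes.append(remaining)
    return sizes
```

With 2 KiB blocks, <code>full_extent</code> is 0xFFFFF800 bytes, so a 5 GiB file needs two Directory Records.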
== System Area ==
An ISO 9660 filesystem begins with a 32 KiB System Area, which may be used for arbitrary
data. This is often used to store boot information for the case that
the ISO 9660 filesystem is not stored on optical media, but rather on
a hard-disk-like device, e.g. on a USB stick.
So be prepared to find a Master Boot Record
(MBR, for BIOS), a GUID Partition Table (GPT, for EFI), or an
Apple Partition Map (APM) at that location.
== Volume Descriptors ==
When preparing to mount a CD, your first action will be reading the volume descriptors (specifically, you will be looking for the ''Primary Volume Descriptor'').
Since sectors 0x00-0x0F of the CD are reserved as System Area,
the Volume Descriptors can be found starting at sector 0x10 (16). The format of the volume descriptors is as follows:
{| {{wikitable}}
! Offset
! Length (bytes)
! Field
! Datatype
! Description
|-
| 0 || 1 || Type || int8 || Volume Descriptor type code (see below).
|-
| 1 || 5 || Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
| 7 || 2041 || Data || - || Depends on the volume descriptor type.
|}
Each volume descriptor is therefore exactly one sector (2048 bytes) long.
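Reading the descriptors could look like the sketch below, which walks the sectors from 0x10 onwards until it meets the Volume Descriptor Set Terminator (type code 255). The function name is invented here, and <code>image</code> is assumed to be a seekable binary file object for an image with 2 KiB sectors.

```python
SECTOR_SIZE = 2048
FIRST_DESCRIPTOR_SECTOR = 0x10
TERMINATOR_TYPE = 255

def iter_volume_descriptors(image):
    """Yield (type_code, sector_bytes) for every volume descriptor in image."""
    sector = FIRST_DESCRIPTOR_SECTOR
    while True:
        image.seek(sector * SECTOR_SIZE)
        data = image.read(SECTOR_SIZE)
        # Every descriptor carries the identifier 'CD001' at offset 1.
        if len(data) < SECTOR_SIZE or data[1:6] != b"CD001":
            raise ValueError("sector %d is not a volume descriptor" % sector)
        yield data[0], data
        if data[0] == TERMINATOR_TYPE:
            return
        sector += 1
```

A caller looking for the Primary Volume Descriptor would simply pick the first sector yielded with type code 1.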
=== Volume Descriptor Type Codes ===
The Volume Descriptor Type field specifies the type of Volume Descriptor:
{| {{wikitable}}
! Value
! Description
|-
| 0 || Boot Record
|-
| 1 || Primary Volume Descriptor
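|-
| 2 || Supplementary Volume Descriptor
|-
| 3 || Volume Partition Descriptor
|-
| 255 || Volume Descriptor Set Terminator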
|}
When starting out with a basic CD, we are going to be interested in the Primary Volume Descriptor, which points us to the root directory and path tables, which both allow us to find any file on the CD. Using the path table is ideal for minimal implementations which do not wish to search the directory hierarchy node by node. This is slower (string comparisons across the entire
=== The Boot Record ===
The first type of Volume Descriptor is the "Boot Record". The descriptor format is as follows:
{| {{wikitable}}
! Offset
! Length (bytes)
! Field
! Datatype
! Description
|-
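| 0 || 1 || Type || int8 || Zero indicates a Boot Record.
|-
| 1 || 5 || Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
| 7 || 32 || Boot System Identifier || strA || Identifier of the system that can act on the Boot Record and boot from it.
|-
| 39 || 32 || Boot Identifier || strA || Identification of the boot system defined in the rest of this descriptor.
|-
| 71 || 1977 || Boot System Use || - || Custom, used by the boot system.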
|}
The most common Boot System Use specification is [[El-Torito|El Torito]].
It records the block address of the El Torito Boot Catalog at bytes 71 to 74
as a little-endian 32-bit number. This catalog lists the
available boot images, which serve as starting points for booting systems.
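Extracting that block address might look like the following sketch. The function name is made up for this example. The check for the Boot System Identifier text "EL TORITO SPECIFICATION" comes from the El Torito specification rather than from ISO 9660 itself.

```python
import struct

def el_torito_catalog_lba(boot_record):
    """Return the Boot Catalog LBA from a Boot Record sector, or None."""
    if boot_record[0] != 0 or boot_record[1:6] != b"CD001":
        raise ValueError("not a Boot Record")
    # Boot System Identifier: 32 bytes at offset 7.
    system_id = bytes(boot_record[7:39])
    if not system_id.startswith(b"EL TORITO SPECIFICATION"):
        return None  # some other Boot System Use format
    # Bytes 71 to 74: little-endian 32-bit block address.
    return struct.unpack_from("<I", boot_record, 71)[0]
```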
=== The Primary Volume Descriptor ===
This is a lengthy descriptor, but it contains some very useful information for reading the rest of the file system.
{| {{wikitable}}
! Offset
! Length (bytes)
! Field
! Datatype
! Description
|-
| 0 || 1 || Type Code || int8 || Always 0x01 for a Primary Volume Descriptor.
|-
| 1 || 5 || Standard Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Always 0x01.
|-
| 7 || 1 || Unused || - || Always 0x00.
|-
| 8 || 32 || System Identifier || strA || The name of the system that can act upon sectors 0x00-0x0F for the volume
|-
| 40 || 32 || Volume Identifier || strD || Identification of this volume
|-
| 72 || 8 || Unused Field || - || All zeroes.
|-
| 80 || 8 || Volume Space Size || int32_LSB-MSB || Number of Logical Blocks in which the volume is recorded
|-
| 88 || 32 || Unused Field || - || All zeroes.
|-
| 120 || 4 || Volume Set Size || int16_LSB-MSB || The size of the set in this logical volume (number of disks)
|-
| 124 || 4 || Volume Sequence Number || int16_LSB-MSB || The number of this disk in the Volume Set
|-
| 128 || 4 || Logical Block Size || int16_LSB-MSB || The size in bytes of a logical block
|-
| 132 || 8 || Path Table Size || int32_LSB-MSB || The size in bytes of the path table
|-
| 140 || 4 || Location of Type-L Path Table || int32_LSB || LBA location of the path table
|-
| 144 || 4 || Location of the Optional Type-L Path Table || int32_LSB || LBA location of the optional path table
|-
| 148 || 4 || Location of Type-M Path Table || int32_MSB || LBA location of the path table
|-
| 152 || 4 || Location of Optional Type-M Path Table || int32_MSB || LBA location of the optional path table
|-
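| 156 || 34 || Directory entry for the root directory || - || Note that this is not an LBA address but the actual Directory Record of the root directory. Its file identifier is a single byte (0x00), hence the fixed size of 34 bytes.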
|-
| 190 || 128 || Volume Set Identifier || strD || Identifier of the volume set of which this volume is a member
|-
| 318 || 128 || Publisher Identifier || strA || The volume publisher
|-
| 446 || 128 || Data Preparer Identifier || strA || The identifier of the person(s) who prepared the data for this volume.
|-
| 574 || 128 || Application Identifier || strA || Identifies how the data are recorded on this volume.
|-
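| 702 || 37 || Copyright File Identifier || strD || Name of a file in the root directory that contains copyright information for this volume set. All bytes are 0x20 if not specified.
|-
| 739 || 37 || Abstract File Identifier || strD || Name of a file in the root directory that contains abstract information for this volume set. All bytes are 0x20 if not specified.
|-
| 776 || 37 || Bibliographic File Identifier || strD || Name of a file in the root directory that contains bibliographic information for this volume set. All bytes are 0x20 if not specified.
|-
| 813 || 17 || Volume Creation Date and Time || dec-datetime || The date and time when the volume was created.
|-
| 830 || 17 || Volume Modification Date and Time || dec-datetime || The date and time when the volume was last modified.
|-
| 847 || 17 || Volume Expiration Date and Time || dec-datetime || The date and time after which this volume is considered obsolete. If not specified, the volume is never considered obsolete.
|-
| 864 || 17 || Volume Effective Date and Time || dec-datetime || The date and time after which the volume may be used. If not specified, the volume may be used immediately.
|-
| 881 || 1 || File Structure Version || int8 || The version of the directory records and path table (always 0x01).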
|-
| 882 || 1 || Unused || - || Always 0x00.
|-
| 883 || 512 || Application Used || - || Contents not defined by ISO 9660.
|-
| 1395 || 653 || Reserved || - || Reserved by ISO.
|}
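To show how the offsets in this table are used, here is a sketch that pulls the most useful fields out of a Primary Volume Descriptor sector. The function name and the dictionary keys are invented for this example. Only the little-endian halves of the both-endian fields are read, and the root directory fields use the Directory Record offsets 2 (extent LBA) and 10 (data length).

```python
import struct

def parse_primary_volume_descriptor(sector):
    """Extract a few fields from a 2048-byte Primary Volume Descriptor."""
    if len(sector) < 2048 or sector[0] != 1 or sector[1:6] != b"CD001":
        raise ValueError("not a Primary Volume Descriptor")
    root = sector[156:190]  # the 34-byte root Directory Record
    return {
        "volume_id": sector[40:72].decode("ascii", "replace").rstrip(" "),
        "volume_space_size": struct.unpack_from("<I", sector, 80)[0],
        "logical_block_size": struct.unpack_from("<H", sector, 128)[0],
        "path_table_size": struct.unpack_from("<I", sector, 132)[0],
        "l_path_table_lba": struct.unpack_from("<I", sector, 140)[0],
        "root_dir_lba": struct.unpack_from("<I", root, 2)[0],
        "root_dir_size": struct.unpack_from("<I", root, 10)[0],
    }
```

The values <code>root_dir_lba</code> and <code>root_dir_size</code> are enough to read the root directory and start walking the tree.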
=== Volume Descriptor Set Terminator ===
The Volume Descriptor Set Terminator does not currently define bytes 7-2047 of its Volume Descriptor. The only fields in use are therefore the type code (255), the standard identifier ('CD001') and the descriptor version (0x01).
{| {{wikitable}}
! Offset
! Length (bytes)
! Field name
! Datatype
! Description
|-
| 0 || 1 || Type || int8 || 255 indicates a Volume Descriptor Set Terminator.
|-
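| 1 || 5 || Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).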
|}
== The Path Table ==
== Directories ==
At some point when reading from an ISO 9660 CD, you will need a directory record to locate a file, even if you generally use the path table to locate the directory initially. Unlike the path tables, there is only one version of each directory table, and multi-byte numbers are in both-endian format. Every directory starts with two special entries: the "." entry, whose file identifier is the single byte 0x00, and the ".." entry, whose file identifier is the single byte 0x01. A directory record is laid out as follows:
{| {{wikitable}}
!Offset
!Size
!Type
!Description
|-
| 0 || 1 || int8 || Length of Directory Record.
|-
| 1 || 1 || int8 || Extended Attribute Record length.
|-
| 2 || 8 || int32_LSB-MSB || Location of extent (LBA) in both-endian format.
|-
| 10 || 8 || int32_LSB-MSB || Data length (size of extent) in both-endian format.
|-
| 18 || 7 || see format below || Recording date and time
|-
| 25 || 1 || int8 || File flags (bit field).
|-
| 26 || 1 || int8 || File unit size for files recorded in interleaved mode, zero otherwise.
|-
| 27 || 1 || int8 || Interleave gap size for files recorded in interleaved mode, zero otherwise.
|-
| 28 || 4 || int16_LSB-MSB || Volume sequence number - the volume that this extent is recorded on, in 16 bit both-endian format.
|-
| 32 || 1 || int8 || Length of file identifier (file name). For files, the identifier ends with a ';' character followed by the file version number in ASCII-coded decimal (usually '1').
|-
| 33 || (variable) || strD || File identifier.
|-
| (variable) || 1 || -- || Padding field - zero if length of file identifier is even, otherwise this field is not present.
|-
| (variable) || (variable) || -- ||
System Use -
The remaining bytes up to the maximum record size of 255 may be used
for extensions of ISO 9660. The most common one is the System Use Sharing
Protocol (SUSP) and its application, the Rock Ridge Interchange Protocol
(RRIP).
|}
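The record layout above can be parsed as in the following sketch. The function names are invented for this example. It relies on a few details of ECMA-119: the "." and ".." entries use the one-byte identifiers 0x00 and 0x01, a padding byte follows identifiers of even length, and records never cross a sector boundary, so a length byte of zero means the rest of the sector is unused.

```python
import struct

def parse_directory_record(buf, offset=0):
    """Parse the Directory Record at offset; return None at a zero length byte."""
    length = buf[offset]
    if length == 0:
        return None
    if length < 34 or offset + length > len(buf):
        raise ValueError("truncated or malformed directory record")
    rec = buf[offset:offset + length]
    name_len = rec[32]
    ident = bytes(rec[33:33 + name_len])
    if ident == b"\x00":
        name = "."
    elif ident == b"\x01":
        name = ".."
    else:
        name = ident.decode("ascii", "replace")
    # A padding byte follows the identifier when its length is even.
    system_use_start = 33 + name_len + (1 if name_len % 2 == 0 else 0)
    return {
        "length": length,
        "extent_lba": struct.unpack_from("<I", rec, 2)[0],
        "data_length": struct.unpack_from("<I", rec, 10)[0],
        "flags": rec[25],
        "name": name,
        "system_use": bytes(rec[system_use_start:]),
    }

def iter_directory_records(extent, sector_size=2048):
    """Yield all records of a directory extent that was read into memory."""
    offset = 0
    while offset < len(extent):
        rec = parse_directory_record(extent, offset)
        if rec is None:
            # Rest of this sector is padding; continue with the next sector.
            offset = (offset // sector_size + 1) * sector_size
            continue
        yield rec
        offset += rec["length"]
```

The <code>system_use</code> bytes are returned untouched so that a SUSP or Rock Ridge reader can interpret them separately.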
# Repeat steps 3 and 4 for the file identifier 'MYLOADER;1'.
# Scan the 'MYLOADER' directory for 'STAGE2.BIN;1'. If found, you can now use the LBA value to load your file into memory.
== Rock Ridge and Joliet ==
There are two enhancements to ISO 9660 which make it more suitable for
the worlds of Unix and of MS-Windows. Both can be combined in the same
filesystem, so the reader often has the choice between three file name
spaces: plain ISO, Rock Ridge, and Joliet.
ISO and Rock Ridge will show the same tree of files but with different names.
Joliet can show a completely different tree than ISO.
Rock Ridge allows for file names of up to 255 8-bit characters. Only
the 0-byte and the slash ("/") may not be used. It also adds the
file attributes which are specified by POSIX (owner, group, permissions, ...)
and it allows for symbolic links.
Rock Ridge is an application of SUSP. It may be accompanied by other
SUSP applications like zisofs (compression of data files, Linux
specific), Apple ISO 9660 Extensions, Amiga AS entries, or Arbitrary
Attribute Interchange Protocol (AAIP: Extended Attributes and ACLs).
A reader of SUSP entries shall simply ignore all entry types
which it does not expect.
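The rule of skipping unknown entries is easy to follow because every SUSP entry starts with the same four bytes: a two-character signature, the entry length (which includes these four bytes), and a version number. The sketch below walks a System Use field and collects the Rock Ridge name from NM entries, whose data consist of a flags byte followed by name bytes. The function names are invented, and the sketch is deliberately incomplete: it does not follow CE continuation areas and does not honour the skip length announced by an SP entry.

```python
def iter_susp_entries(system_use):
    """Yield (signature, version, data) for each SUSP entry in system_use."""
    offset = 0
    while offset + 4 <= len(system_use):
        signature = bytes(system_use[offset:offset + 2]).decode("ascii", "replace")
        length = system_use[offset + 2]   # total entry length, header included
        version = system_use[offset + 3]
        if length < 4 or offset + length > len(system_use):
            break  # malformed entry or trailing padding: stop quietly
        yield signature, version, bytes(system_use[offset + 4:offset + length])
        offset += length

def rock_ridge_name(system_use):
    """Join the name bytes of all NM entries; return None if there are none."""
    parts = [data[1:]  # data[0] is the NM flags byte
             for signature, _version, data in iter_susp_entries(system_use)
             if signature == "NM" and data]
    return b"".join(parts) if parts else None
```

The name is returned as raw bytes because Rock Ridge does not prescribe a character encoding beyond 8-bit characters.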
Joliet was defined by Microsoft to allow for filenames with
up to 64 UCS-2 characters (16 bit). It is implemented as a separate
tree of Directory Records which begins at a root record in a
Supplementary Volume Descriptor. That descriptor is similar to a
Primary Volume Descriptor, but has a Type Code of 2.
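A reader can recognise a Joliet tree and decode its names roughly as sketched below. The details not stated above are taken from the Joliet specification: the Supplementary Volume Descriptor carries one of three escape sequences at offset 88 (the field that is unused in the Primary Volume Descriptor), and names are stored as big-endian UCS-2. The function names are made up for this example.

```python
# Escape sequences for the three UCS-2 levels, per the Joliet specification.
JOLIET_ESCAPES = (b"%/@", b"%/C", b"%/E")

def is_joliet_descriptor(sector):
    """True if sector is a Supplementary Volume Descriptor announcing Joliet."""
    return (
        len(sector) >= 2048
        and sector[0] == 2
        and bytes(sector[1:6]) == b"CD001"
        and bytes(sector[88:91]) in JOLIET_ESCAPES
    )

def decode_joliet_name(raw):
    """Decode a Joliet file identifier (big-endian UCS-2)."""
    # UTF-16-BE is a superset of UCS-2, so it decodes every valid name.
    return raw.decode("utf-16-be")
```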
== See Also ==
=== Articles ===
* [[El-Torito]], a standard for creating bootable CD-ROMs
* [[Mkisofs]], about ISO 9660 producing programs: mkisofs, genisoimage, xorriso
* [[Optical Drive]], an overview about how to operate optical drives and media
=== External links ===
* [http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-119.pdf ISO 9660 (ECMA-119) specification]
* [[wikipedia:ISO 9660|ISO 9660 on Wikipedia]]
* [https://dev.lovelyhq.com/libburnia/libisofs/raw/master/doc/boot_sectors.txt Boot entry points in ISO 9660 filesystems]
* [ftp://ftp.ymi.com/pub/rockridge/susp112.ps SUSP 1.12 (entries CE , PD , SP , ST , ER , ES)]
* [ftp://ftp.ymi.com/pub/rockridge/rrip112.ps Rock Ridge: RRIP 1.12 (SUSP entries PX , PN , SL , NM , CL , PL , RE , TF , SF , obsolete: RR)]
* [http://www.estamos.de/makecd/Rock_Ridge_Amiga_Specific Amiga SUSP entry AS]
* [https://dev.lovelyhq.com/libburnia/libisofs/raw/master/doc/susp_aaip_2_0.txt libisofs SUSP application AAIP (SUSP entry AL)]
* [http://www.buildorbuy.org/pdf/joliet.pdf Joliet add-on specification]
[[de:ISO9660]]