ISO 9660: Difference between revisions

{{Filesystems}}
'''ISO 9660''' is the standard file system for CD-ROMs. It is also widely used
on DVD and BD media and may also be present on USB sticks or hard disks.
Its specification is available for free under the name '''ECMA-119'''.
 
== Overview and caveats ==
{{In Progress}}
ISO 9660 is not a complex file system, but it has a few quirks that are worth remembering. Some operating systems create non-compliant CDs, so beware! The main example of this is the character set that is available for file names. Strictly, filenames may only consist of uppercase letters A-Z, digits, dots, and underscores. In addition, a semicolon separates the visible file name from its version number suffix. Many operating systems also allow lower case letters and other characters. Linux's [[VFS]] displays lower case filenames to the user even though the CD actually contains upper case characters.

Another (perhaps little-known and little-utilised) feature of the ISO 9660 file system is that a single file system can span multiple CDs. This is dealt with via "set numbers".

=== Sector size ===
An ISO 9660 sector is normally 2 KiB long. Although the specification allows for alternative sector sizes, you will rarely find anything other than 2 KiB.

=== Numerical formats ===
Another quirk of the system is that it has several numbering formats, and multi-byte numbers are often represented in ''both-endian'' format. The ISO 9660 standard specifies three ways to encode 16 and 32-bit integers: little-endian (least-significant byte first), big-endian (most-significant byte first), or a combination of both (little-endian followed by big-endian). Section 7.3 of the standard, for example, defines the 32-bit variants, which are referred to as 7.3.1 format (LSB first), 7.3.2 format (MSB first) and 7.3.3 format (both-endian). Both-endian (LSB-MSB) fields are therefore twice as wide. For this reason, 32-bit LBAs often appear as 8 byte fields. Where a both-endian format is present, code for the x86 architecture typically uses the first (little-endian) sequence and ignores the big-endian sequence.
 
{| {{wikitable}}
! Encoding
! Description
|-
| int8 || Unsigned 8-bit integer.
|-
| sint8 || Signed 8-bit integer.
|-
| int16_LSB || Little-endian encoded unsigned 16-bit integer.
|-
| int16_MSB || Big-endian encoded unsigned 16-bit integer.
|-
| int16_LSB-MSB || Little-endian followed by big-endian encoded unsigned 16-bit integer.
|-
| sint16_LSB || Little-endian encoded signed 16-bit integer.
|-
| sint16_MSB || Big-endian encoded signed 16-bit integer.
|-
| sint16_LSB-MSB || Little-endian followed by big-endian encoded signed 16-bit integer.
|-
| int32_LSB || Little-endian encoded unsigned 32-bit integer.
|-
| int32_MSB || Big-endian encoded unsigned 32-bit integer.
|-
| int32_LSB-MSB || Little-endian followed by big-endian encoded unsigned 32-bit integer.
|-
| sint32_LSB || Little-endian encoded signed 32-bit integer.
|-
| sint32_MSB || Big-endian encoded signed 32-bit integer.
|-
| sint32_LSB-MSB || Little-endian followed by big-endian encoded signed 32-bit integer.
|}
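
The both-endian encoding described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and raising an error when the two halves disagree is one possible policy (a tolerant reader might prefer to trust the little-endian half only).

```python
import struct

def read_int16_lsb_msb(buf, offset=0):
    """Decode an int16_LSB-MSB field (4 bytes) and cross-check both halves."""
    little = struct.unpack_from("<H", buf, offset)[0]
    big = struct.unpack_from(">H", buf, offset + 2)[0]
    if little != big:
        raise ValueError("both-endian halves disagree")
    return little

def read_int32_lsb_msb(buf, offset=0):
    """Decode an int32_LSB-MSB field (8 bytes) and cross-check both halves."""
    little = struct.unpack_from("<I", buf, offset)[0]
    big = struct.unpack_from(">I", buf, offset + 4)[0]
    if little != big:
        raise ValueError("both-endian halves disagree")
    return little

def write_int32_lsb_msb(value):
    """Encode a value as little-endian followed by big-endian (8 bytes)."""
    return struct.pack("<I", value) + struct.pack(">I", value)
```

For example, the value 0x12345678 is stored as the bytes 78 56 34 12 12 34 56 78.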
 
=== Date/time format ===
The date/time format used in the ''Primary Volume Descriptor'' is denoted as ''dec-datetime'' and uses ASCII digits to represent the main parts of the date/time:

{| {{wikitable}}
!Offset
!Size
!Datatype
!Description
|-
| 0 || 4 || strD || Year from 1 to 9999.
|-
| 4 || 2 || strD || Month from 1 to 12.
|-
| 6 || 2 || strD || Day from 1 to 31.
|-
| 8 || 2 || strD || Hour from 0 to 23.
|-
| 10 || 2 || strD || Minute from 0 to 59.
|-
| 12 || 2 || strD || Second from 0 to 59.
|-
| 14 || 2 || strD || Hundredths of a second from 0 to 99.
|-
| 16 || 1 || sint8 || Time zone offset from GMT in 15 minute intervals, recorded as a signed value from -48 (west) to +52 (east). So value -48 equals GMT-12 hours, value 0 equals GMT, and value 52 equals GMT+13 hours.
|}
 
All fields except for the offset from GMT are in ASCII digits. When the date and time is not specified, all string fields are ASCII '0' (for a total of 16 ASCII zeroes) and the last field is binary zero.
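
A minimal sketch of a ''dec-datetime'' parser follows. The function name and the returned dictionary layout are invented for this example. It assumes that the last byte is a two's-complement signed count of 15 minute intervals, as ECMA-119 specifies for 8-bit signed values; treat that as an assumption to verify against your copy of the standard.

```python
import struct

def parse_dec_datetime(field):
    """Parse a 17-byte dec-datetime field; return None when it is unspecified."""
    if len(field) != 17:
        raise ValueError("dec-datetime must be 17 bytes long")
    digits = field[:16].decode("ascii")
    # Assumption: byte 16 is a signed number of 15-minute intervals from GMT.
    tz_intervals = struct.unpack_from("b", field, 16)[0]
    if digits == "0" * 16 and tz_intervals == 0:
        return None  # 16 ASCII zeroes plus a binary zero: not specified
    return {
        "year": int(digits[0:4]),
        "month": int(digits[4:6]),
        "day": int(digits[6:8]),
        "hour": int(digits[8:10]),
        "minute": int(digits[10:12]),
        "second": int(digits[12:14]),
        "hundredths": int(digits[14:16]),
        "gmt_offset_minutes": tz_intervals * 15,
    }
```

Non-digit characters in the string part make <code>int()</code> raise a <code>ValueError</code>, which a real reader may want to catch because not all images are compliant.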
 
=== String format ===
Character strings are encoded with ASCII encoding. The specification does not permit all characters. It defines two sets of characters: 'a-characters' and 'd-characters'. You will see these terms used in the descriptor tables throughout this article. The character sets are:
 
<pre>
a-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
! " % & ' ( ) * + , - . / : ; < = > ?
</pre>
 
<pre>
d-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
</pre>
 
{| {{wikitable}}
! Encoding
! Description
|-
| strA || String with only ASCII a-characters, padded to the right with spaces.
|-
| strD || String with only ASCII d-characters, padded to the right with spaces.
|}
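
The following sketch decodes a padded strA or strD field and reports characters outside the permitted set instead of rejecting the field, since real images do not always comply. The names <code>A_CHARACTERS</code>, <code>D_CHARACTERS</code> and <code>decode_str</code> are hypothetical. Treating the space as an a-character is an assumption based on the ECMA-119 character table; it is not visible in the space-separated list above.

```python
D_CHARACTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_")
# Assumption: the space character counts as an a-character.
A_CHARACTERS = D_CHARACTERS | set("!\"%&'()*+,-./:;<=>? ")

def decode_str(field, allowed):
    """Return (text, offending_characters) for a space-padded string field."""
    # Bytes outside ASCII become U+FFFD and are reported as offending.
    text = field.decode("ascii", errors="replace").rstrip(" ")
    offending = sorted(set(text) - allowed)
    return text, offending
```

A caller can then decide whether a non-empty list of offending characters is an error or merely a warning.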
 
Note that not all CDs strictly adhere to the character sets specified in ISO 9660.
 
=== Filenames ===
Filenames must use d-character encoding (strD), plus a dot and a semicolon,
each of which has to occur exactly once per filename.
Filenames are composed of a File Name, a dot, a File Name Extension,
a semicolon, and a version number in decimal digits. The latter two
are usually not displayed to the user.
 
There are three Levels of Interchange defined. Level 1 allows filenames
with a File Name length of 8 and an extension length of 3 (like MS-DOS).
Levels 2 and 3 allow File Name and File Name Extension to have a combined
length of up to 30 characters.
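
As an illustration of the filename structure and the length rules above, here is a small sketch. The helper names are invented, and the splitter is deliberately lenient (a missing extension or version does not raise an error) because non-compliant names do occur in practice.

```python
def split_file_identifier(identifier):
    """Split 'NAME.EXT;VERSION' into (name, extension, version or None)."""
    base, semicolon, version = identifier.partition(";")
    name, _dot, extension = base.partition(".")
    parsed_version = int(version) if semicolon and version.isdigit() else None
    return name, extension, parsed_version

def fits_level_1(identifier):
    """Level 1: File Name up to 8 characters, extension up to 3."""
    name, extension, _version = split_file_identifier(identifier)
    return len(name) <= 8 and len(extension) <= 3

def fits_level_2(identifier):
    """Levels 2 and 3: File Name plus extension up to 30 characters combined."""
    name, extension, _version = split_file_identifier(identifier)
    return len(name) + len(extension) <= 30
```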
 
The ECMA-119 Directory Record format can hold composed names of up to 222
characters. Names of that length violate the specification, but a reader
of the filesystem must nevertheless be able to handle them.
 
=== Size Limitations ===
ISO 9660 filesystems can have up to 2<sup>32</sup> blocks, i.e. 8 TiB
with 2 KiB blocks.
Normally they will be restricted to the size of optical media.
(Currently up to 100 GiB with 4-layer BD-R.)

The maximum size of data files depends on the Level of Interchange
that is intended for the ISO filesystem. Levels 1 and 2 allow for
4 GiB - 1, because a single Directory Record can claim up to that
number of bytes. Level 3 allows multiple consecutive
Directory Records with the same name, whose extents are to be concatenated
into a single data file. This means that a single data file can nearly
fill up the full 8 TiB of image size.
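
The arithmetic behind these limits can be checked with a short sketch. The constant and function names are made up. The estimate of how many Level 3 Directory Records a large file needs assumes that every extent except the last is filled to the largest block-aligned size below 4 GiB; that is a common convention, not something stated above.

```python
BLOCK_SIZE = 2048                 # usual logical block size in bytes
MAX_BLOCKS = 2 ** 32              # block addresses are 32-bit
MAX_IMAGE_BYTES = MAX_BLOCKS * BLOCK_SIZE    # 2**43 bytes = 8 TiB
MAX_EXTENT_BYTES = 2 ** 32 - 1    # one Directory Record: 4 GiB - 1

# Assumption: non-final extents are block-aligned, so they hold 4 GiB - 2 KiB.
MAX_ALIGNED_EXTENT = (MAX_EXTENT_BYTES // BLOCK_SIZE) * BLOCK_SIZE

def extents_needed(file_size):
    """Number of Level 3 Directory Records needed for a file of this size."""
    return max(1, -(-file_size // MAX_ALIGNED_EXTENT))  # ceiling division
```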
 
== System Area ==
An ISO 9660 filesystem begins with a 32 KiB area which may be used for
arbitrary data. This is often used to store boot information for the case
that the ISO 9660 filesystem is not stored on optical media, but rather on
a hard-disk-like device, e.g. on a USB stick.

So be prepared to find at that location a Master Boot Record
(MBR, for BIOS), a GUID Partition Table (GPT, for EFI), or an
Apple Partition Map (APM).
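
A rough way to probe the System Area for two of these structures is sketched below. The function name is hypothetical, and the signature offsets are assumptions taken from the usual MBR and GPT layouts with 512-byte sectors (boot signature 0x55 0xAA at bytes 510-511, "EFI PART" at the start of the second 512-byte sector). APM detection is left out of this sketch.

```python
def probe_system_area(system_area):
    """Return the names of boot structures that appear in the first 32 KiB."""
    found = []
    # Assumption: an MBR carries the boot signature 0x55 0xAA at offset 510.
    if system_area[510:512] == b"\x55\xaa":
        found.append("MBR")
    # Assumption: a GPT header starts at offset 512 with the text "EFI PART".
    if system_area[512:520] == b"EFI PART":
        found.append("GPT")
    return found
```

A positive match is only a hint; a careful reader would go on to validate the partition entries themselves.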
 
== Volume Descriptors ==
When preparing to mount a CD, your first action will be reading the volume descriptors (specifically, you will be looking for the ''Primary Volume Descriptor'').
 
Since sectors 0x00-0x0F of the CD are reserved as System Area,
the Volume Descriptors can be found starting at sector 0x10 (16). The format of the volume descriptors is as follows:
 
{| {{wikitable}}
! Offset
! Length (bytes)
! Field name
! Datatype
! Description
|-
| 0 || 1 || Type || int8 || Volume Descriptor type code (see below).
|-
| 1 || 5 || Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
| 7 || 2041 || Data || - || Depends on the volume descriptor type.
|}
 
Each volume descriptor is therefore one sector (2 KiB) long.
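
Walking the volume descriptors could look roughly like the sketch below. It assumes 2 KiB sectors and an image opened as a seekable binary file object; the function name is invented. It stops at the Volume Descriptor Set Terminator (type 255) or at the first sector that lacks the 'CD001' identifier.

```python
import io

SECTOR_SIZE = 2048  # assumption: the usual 2 KiB sector size

def read_volume_descriptors(image):
    """Yield (type_code, sector_bytes) for each descriptor from sector 16 on."""
    image.seek(16 * SECTOR_SIZE)
    while True:
        sector = image.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE or sector[1:6] != b"CD001":
            break  # truncated image or not a volume descriptor
        yield sector[0], sector
        if sector[0] == 255:
            break  # Volume Descriptor Set Terminator

# Usage with an in-memory image that holds only a terminator:
demo = io.BytesIO(bytes(16 * SECTOR_SIZE) + b"\xffCD001\x01" + bytes(2041))
demo_types = [type_code for type_code, _ in read_volume_descriptors(demo)]
```

With a real image, <code>open("image.iso", "rb")</code> would take the place of the in-memory object.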
 
=== Volume Descriptor Type Codes ===
The Volume Descriptor Type field specifies the type of Volume Descriptor:
 
{| {{wikitable}}
! Value
! Description
|-
| 0 || Volume descriptor is a Boot Record
|-
| 1 || Primary Volume Descriptor
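|-
| 2 || Supplementary Volume Descriptor
|-
| 3 || Volume Partition Descriptor
|-
| 255 || Volume Descriptor Set Terminator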
|}
 
When starting out with a basic CD, we are going to be interested in the Primary Volume Descriptor, which points us to the root directory and path tables, which both allow us to find any file on the CD. Using the path table is ideal for minimal implementations which do not wish to search the directory hierarchy node by node. This is slower (string comparisons across the entire file system) but easier to implement.
 
=== The Boot Record ===
 
The first type of Volume Descriptor is the "Boot Record". The descriptor format is as follows:
 
{| {{wikitable}}
! Offset
! Length (bytes)
! Field name
! Datatype
! Description
|-
| 0 || 1 || Type || int8 || Zero indicates a boot record.
|-
| 1 || 5 || Identifier || strA || Always "CD001".
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
| 7 || 32 || Boot System Identifier || strA || ID of the system which can act on and boot the system from the boot record in a-characters.
|-
| 39 || 32 || Boot Identifier || strA || Identification of the boot system defined in the rest of this descriptor in a-characters.
|-
| 71 || 1977 || Boot System Use || - || Custom - used by the boot system.
|}
 
The most common Boot System Use specification is [[El-Torito|El Torito]].
It records the block address of the El Torito Boot Catalog at bytes 71 to 74
as a little-endian 32-bit number. This catalog lists the
available boot images, which serve as starting points for booting systems.
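
Extracting the Boot Catalog address from a Boot Record sector might look like this. It is a sketch only: the function name is made up, and it does not verify that the Boot System Identifier actually names El Torito, which a real loader should do.

```python
import struct

def boot_catalog_lba(boot_record):
    """Return the block address stored at bytes 71-74 of a Boot Record."""
    if boot_record[0] != 0 or boot_record[1:6] != b"CD001":
        raise ValueError("not a Boot Record volume descriptor")
    # Bytes 71 to 74: little-endian 32-bit block address of the Boot Catalog.
    return struct.unpack_from("<I", boot_record, 71)[0]
```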
 
=== The Primary Volume Descriptor ===
This is a lengthy descriptor, but it contains some very useful information for reading the rest of the file system.
 
{| {{wikitable}}
! Offset
! Length (bytes)
! Field name
! Datatype
! Description
|-
| 0 || 1 || Type Code || int8 || Always 0x01 for a Primary Volume Descriptor.
|-
| 1 || 5 || Standard Identifier || strA || Always 'CD001'.
|-
| 6 || 1 || Version || int8 || Always 0x01.
|-
| 7 || 1 || Unused || - || Always 0x00.
|-
| 8 || 32 || System Identifier || strA || The name of the system that can act upon sectors 0x00-0x0F for the volume in a-characters.
|-
| 40 || 32 || Volume Identifier || strD || Identification of this volume in d-characters.
|-
| 72 || 8 || Unused Field || - || All zeroes.
|-
| 80 || 8 || Volume Space Size || int32_LSB-MSB || Number of Logical Blocks in which the volume is recorded. This is a 32 bit value in both-endian format.
|-
| 88 || 32 || Unused Field || - || All zeroes.
|-
| 120 || 4 || Volume Set Size || int16_LSB-MSB || The size of the set in this logical volume (number of disks). This is a 16 bit value in both-endian format.
|-
| 124 || 4 || Volume Sequence Number || int16_LSB-MSB || The number of this disk in the Volume Set. This is a 16 bit value in both-endian format.
|-
| 128 || 4 || Logical Block Size || int16_LSB-MSB || The size in bytes of a logical block. NB: This means that a logical block on a CD could be something other than 2 KiB!
|-
| 132 || 8 || Path Table Size || int32_LSB-MSB || The size in bytes of the path table.
|-
| 140 || 4 || Location of Type-L Path Table || int32_LSB || LBA location of the path table. The path table pointed to contains only little-endian values.
|-
| 144 || 4 || Location of the Optional Type-L Path Table || int32_LSB || LBA location of the optional path table. The path table pointed to contains only little-endian values. Zero means that no optional path table exists.
|-
| 148 || 4 || Location of Type-M Path Table || int32_MSB || LBA location of the path table. The path table pointed to contains only big-endian values.
|-
| 152 || 4 || Location of Optional Type-M Path Table || int32_MSB || LBA location of the optional path table. The path table pointed to contains only big-endian values. Zero means that no optional path table exists.
|-
| 156 || 34 || Directory entry for the root directory || - || Note that this is not an LBA address, it is the actual Directory Record, which contains a single byte Directory Identifier (0x00), hence the fixed 34 byte size.
|-
| 190 || 128 || Volume Set Identifier || strD || Identifier of the volume set of which this volume is a member.
|-
| 318 || 128 || Publisher Identifier || strA || The volume publisher. For extended publisher information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
| 446 || 128 || Data Preparer Identifier || strA || The identifier of the person(s) who prepared the data for this volume. For extended preparation information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
| 574 || 128 || Application Identifier || strA || Identifies how the data are recorded on this volume. For extended information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
| 702 || 37 || Copyright File Identifier || strD || Filename of a file in the root directory that contains copyright information for this volume set. If not specified, all bytes should be 0x20.
|-
| 739 || 37 || Abstract File Identifier || strD || Filename of a file in the root directory that contains abstract information for this volume set. If not specified, all bytes should be 0x20.
|-
| 776 || 37 || Bibliographic File Identifier || strD || Filename of a file in the root directory that contains bibliographic information for this volume set. If not specified, all bytes should be 0x20.
|-
| 813 || 17 || Volume Creation Date and Time || dec-datetime || The date and time of when the volume was created.
|-
| 830 || 17 || Volume Modification Date and Time || dec-datetime || The date and time of when the volume was modified.
|-
| 847 || 17 || Volume Expiration Date and Time || dec-datetime || The date and time after which this volume is considered to be obsolete. If not specified, then the volume is never considered to be obsolete.
|-
| 864 || 17 || Volume Effective Date and Time || dec-datetime || The date and time after which the volume may be used. If not specified, the volume may be used immediately.
|-
| 881 || 1 || File Structure Version || int8 || The directory records and path table version (always 0x01).
|-
| 882 || 1 || Unused || - || Always 0x00.
|-
| 883 || 512 || Application Used || - || Contents not defined by ISO 9660.
|-
| 1395 || 653 || Reserved || - || Reserved by ISO.
|}
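
The fields a minimal reader needs most can be pulled out of the Primary Volume Descriptor as sketched below. The function name and dictionary keys are invented for this example. For both-endian fields the sketch reads only the little-endian half and does not cross-check it against the big-endian half.

```python
import struct

def parse_primary_volume_descriptor(sector):
    """Extract a few key fields from a 2048-byte Primary Volume Descriptor."""
    if len(sector) < 2048 or sector[0] != 1 or sector[1:6] != b"CD001":
        raise ValueError("not a Primary Volume Descriptor")
    volume_id = sector[40:72].decode("ascii", errors="replace").rstrip(" ")
    return {
        "volume_identifier": volume_id,
        "volume_space_size": struct.unpack_from("<I", sector, 80)[0],
        "logical_block_size": struct.unpack_from("<H", sector, 128)[0],
        "path_table_size": struct.unpack_from("<I", sector, 132)[0],
        "type_l_path_table_lba": struct.unpack_from("<I", sector, 140)[0],
        "type_m_path_table_lba": struct.unpack_from(">I", sector, 148)[0],
        # The root entry is a complete 34-byte Directory Record, not an LBA.
        "root_directory_record": bytes(sector[156:190]),
    }
```

The byte size of the volume is then <code>volume_space_size * logical_block_size</code>.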
 
=== Volume Descriptor Set Terminator ===
The Volume Descriptor Set Terminator does not currently define bytes 7-2047 of its Volume Descriptor. This means that the only fields in use for the volume set terminator are the type code (255), the standard identifier ('CD001') and the descriptor version (0x01).

{| {{wikitable}}
! Offset
! Length (bytes)
! Field name
! Datatype
! Description
|-
| 0 || 1 || Type || int8 || 255 indicates a Volume Descriptor Set Terminator.
|-
| 1 || 5 || Identifier || strA || Always "CD001".
|-
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|}
 
== The Path Table ==
 
== Directories ==
At some point when reading from an ISO 9660 CD, you will need a directory record to locate a file, even if you generally use the path table to locate the directory initially. Unlike the path tables, there is only one version of each directory table, and multi-byte numbers are in both-endian format. Every directory starts with 2 special entries: one whose identifier is a single 0x00 byte, describing the "." entry, and one whose identifier is a single 0x01 byte ("\1"), describing the ".." entry. A directory record is laid out as follows:
 
{| {{wikitable}}
!Offset
!Size
!Type
!Description
|-
| 0 || 1 || int8 || Length of Directory Record.
|-
| 1 || 1 || int8 || Extended Attribute Record length.
|-
| 2 || 8 || int32_LSB-MSB || Location of extent (LBA) in both-endian format.
|-
| 10 || 8 || int32_LSB-MSB || Data length (size of extent) in both-endian format.
|-
| 18 || 7 || see format below || Recording date and time (see format below).
|-
| 25 || 1 || File flags (see below) || File flags.
|-
| 26 || 1 || int8 || File unit size for files recorded in interleaved mode, zero otherwise.
|-
| 27 || 1 || int8 || Interleave gap size for files recorded in interleaved mode, zero otherwise.
|-
| 28 || 4 || int16_LSB-MSB || Volume sequence number - the volume that this extent is recorded on, in 16 bit both-endian format.
|-
| 32 || 1 || int8 || Length of file identifier (file name). The identifier terminates with a ';' character followed by the file version number in ASCII coded decimal ('1').
|-
| 33 || (variable) || strD || File identifier.
|-
| (variable) || 1 || -- || Padding field - zero if length of file identifier is even, otherwise, this field is not present. This means that a directory entry will always start on an even byte number.
|-
| (variable) || (variable) || -- || System Use - The remaining bytes up to the maximum record size of 255 may be used for extensions of ISO 9660. The most common one is the System Use Sharing Protocol (SUSP) and its application, the Rock Ridge Interchange Protocol (RRIP).
|}
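
The record layout above can be decoded roughly as follows. The function name and dictionary keys are made up. The recording date and the file flags are returned undecoded, and the treatment of a zero length byte as "no further records in this sector" is an assumption of this sketch.

```python
import struct

def parse_directory_record(buf, offset=0):
    """Decode one Directory Record starting at offset; None if length is 0."""
    length = buf[offset]
    if length == 0:
        return None  # assumption: zero padding, no more records in this sector
    id_len = buf[offset + 32]
    id_start = offset + 33
    # A padding byte follows the identifier when its length is even.
    pad = 1 if id_len % 2 == 0 else 0
    return {
        "length": length,
        "extended_attribute_length": buf[offset + 1],
        "extent_lba": struct.unpack_from("<I", buf, offset + 2)[0],
        "data_length": struct.unpack_from("<I", buf, offset + 10)[0],
        "recording_date": bytes(buf[offset + 18:offset + 25]),  # left undecoded
        "file_flags": buf[offset + 25],                         # left undecoded
        "identifier": bytes(buf[id_start:id_start + id_len]),
        "system_use": bytes(buf[id_start + id_len + pad:offset + length]),
    }
```

To walk a directory, call the function repeatedly and advance <code>offset</code> by the returned <code>length</code> until the extent's data length is exhausted.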
 
# Repeat steps 3 and 4 for the file identifier 'MYLOADER;1'.
# Scan the 'MYLOADER' directory for 'STAGE2.BIN;1'. If found, you can now use the LBA value to load your file into memory.
 
== Rock Ridge and Joliet ==
There are two enhancements for ISO 9660 which make it more suitable for
the worlds of Unix and of MS-Windows. Both can be combined in the same
filesystem, so a reader often has the choice between three file name
spaces: plain ISO, Rock Ridge, and Joliet.

ISO and Rock Ridge show the same tree of files but with different names.
Joliet can show a completely different tree than ISO.

Rock Ridge allows for file names of up to 255 8-bit characters. Only
the 0-byte and the slash ("/") may not be used. It also adds the
file attributes which are specified by POSIX (owner, group, permissions, ...)
and it allows for symbolic links.

Rock Ridge is an application of SUSP. It may be accompanied by other
SUSP applications like zisofs (compression of data files, Linux
specific), Apple ISO 9660 Extensions, Amiga AS entries, or the Arbitrary
Attribute Interchange Protocol (AAIP: Extended Attributes and ACLs).
A reader of SUSP entries should simply ignore all entry types
which it does not expect.

Joliet was defined by Microsoft to allow for filenames with
up to 64 UCS-2 characters (16 bit). It is implemented as a separate
tree of Directory Records which begins with a root record in a
Supplementary Volume Descriptor. That descriptor is similar to a
Primary Volume Descriptor, but has a Type Code of 2.
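
Recognising a Joliet descriptor and decoding its names could be sketched as below. Two details are assumptions taken from the Joliet specification rather than from the text above: the descriptor carries one of the escape sequences "%/@", "%/C" or "%/E" at offset 88, and names are stored as big-endian UCS-2. The function names are hypothetical.

```python
# Assumption: these escape sequences mark UCS-2 levels 1 to 3 in Joliet.
JOLIET_ESCAPES = (b"%/@", b"%/C", b"%/E")

def is_joliet_descriptor(sector):
    """True if the sector looks like a Joliet Supplementary Volume Descriptor."""
    return (
        sector[0] == 2
        and sector[1:6] == b"CD001"
        and bytes(sector[88:91]) in JOLIET_ESCAPES
    )

def decode_joliet_name(raw):
    """Decode a Joliet file identifier (assumed big-endian UCS-2)."""
    return raw.decode("utf-16-be")
```

A Supplementary Volume Descriptor without such an escape sequence is not Joliet and should be ignored by a Joliet reader.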
 
== See Also ==
=== Articles ===
* [[El-Torito]], a standard for creating bootable CD-ROMs
* [[Mkisofs]], about ISO 9660 producing programs: mkisofs, genisoimage, xorriso
* [[Optical Drive]], an overview about how to operate optical drives and media
 
=== External links ===
* [http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-119.pdf ISO 9660 (ECMA-119) specification]
* [[wikipedia:ISO 9660|ISO 9660 on Wikipedia]]
* [https://dev.lovelyhq.com/libburnia/libisofs/raw/master/doc/boot_sectors.txt Boot entry points in ISO 9660 filesystems]
* [ftp://ftp.ymi.com/pub/rockridge/susp112.ps SUSP 1.12 (entries CE , PD , SP , ST , ER , ES)]
* [ftp://ftp.ymi.com/pub/rockridge/rrip112.ps Rock Ridge: RRIP 1.12 (SUSP entries PX , PN , SL , NM , CL , PL , RE , TF , SF , obsolete: RR)]
* [http://www.estamos.de/makecd/Rock_Ridge_Amiga_Specific Amiga SUSP entry AS]
* [https://dev.lovelyhq.com/libburnia/libisofs/raw/master/doc/susp_aaip_2_0.txt libisofs SUSP application AAIP (SUSP entry AL)]
* [http://www.buildorbuy.org/pdf/joliet.pdf Joliet add-on specification]
 
[[de:ISO9660]]