ISO 9660: Difference between revisions

Major rewrite, added datatype information, put datatypes together
Line 1: Line 1:
{{Filesystems}}
{{Filesystems}}
'''ISO 9660''' is the standard filesystem for CD-ROMs.
'''ISO 9660''' is the standard file system for CD-ROMs.


{{In Progress}}
{{In Progress}}


== Overview and Caveats ==
== Overview and caveats ==
ISO 9660 is not a complex file system, but has a few quirks that are worth remembering. It seems that some operating systems also create non-compliant CD's, so beware! The main example of this is the character set that is available for file names. Strictly, this consists of A-Z (upper case only!), digits and underscores. Many operating systems also allow lower case letters and other characters. Linux's [[VFS]] displays lower case filenames to the user despite the cd contents actually containing upper case characters.
ISO 9660 is not a complex file system, but it has a few quirks that are worth remembering. Some operating systems create non-compliant CDs, so beware! The main example is the character set available for file names. Strictly, file names may consist only of uppercase letters A-Z, digits and underscores, but many operating systems also allow lowercase letters and other characters. Linux's [[VFS]] displays lowercase file names to the user even though the CD contents actually contain uppercase characters.


Another (perhaps little-known and little-utilised) feature of the ISO 9660 file system is that a single file system can span multiple CD's. This is dealt with via "set numbers".
Another (perhaps little-known and little-utilized) feature of the ISO 9660 file system is that a single file system can span multiple CDs. This is dealt with via ''set numbers''.


=== Sector Size ===
=== Sector size ===
An ISO 9660 sector is normally 2KiB long. Although the specification allows for alternative sector sizes, you will rarely find anything other than 2KiB.
An ISO 9660 sector is normally 2 KiB long. Although the specification allows for alternative sector sizes, you will rarely find anything other than 2 KiB.


=== Numerical Formats ===
=== Numerical formats ===
Another quirk of the system is that it has several numbering formats, and multi-byte numbers are often represented in ''both-endian'' format. The ISO 9660 standard specifies three ways to encode 16-bit and 32-bit integers: little-endian (least-significant byte first), big-endian (most-significant byte first), or both (little-endian followed by big-endian). Both-endian (LSB-MSB) fields are therefore twice as wide, which is why 32-bit LBAs often appear as 8-byte fields. Where a both-endian field is present, code running on a little-endian architecture such as x86 can use the first (little-endian) sequence and ignore the big-endian sequence.


{| {{wikitable}}
Another quirk of the system is that it has several numbering formats and multi-byte numbers are often represented in "both-endian" format (that is, LSB first followed by MSB first). For this reason, 32 bit LBA's often appear in 8 byte fields. ISO 9660 section 7.3, for example, outlines these 32 bit systems and numbers throughout the documents are referred to as being 7.3.1 format (LSB first), 7.3.2 format (MSB first) or 7.3.3 format (both-endian). Where a both-endian format exists, the x86 architecture makes use of the first 32 bit sequence (the LSB-first sequence) and ignores bytes 4-7.
! Encoding
! Description
|-
| int8 || Unsigned 8-bit integer.
|-
| sint8 || Signed 8-bit integer.
|-
| int16_LSB || Little-endian encoded unsigned 16-bit integer.
|-
| int16_MSB || Big-endian encoded unsigned 16-bit integer.
|-
| int16_LSB-MSB || Little-endian followed by big-endian encoded unsigned 16-bit integer.
|-
| sint16_LSB || Little-endian encoded signed 16-bit integer.
|-
| sint16_MSB || Big-endian encoded signed 16-bit integer.
|-
| sint16_LSB-MSB || Little-endian followed by big-endian encoded signed 16-bit integer.
|-
| int32_LSB || Little-endian encoded unsigned 32-bit integer.
|-
| int32_MSB || Big-endian encoded unsigned 32-bit integer.
|-
| int32_LSB-MSB || Little-endian followed by big-endian encoded unsigned 32-bit integer.
|-
| sint32_LSB || Little-endian encoded signed 32-bit integer.
|-
| sint32_MSB || Big-endian encoded signed 32-bit integer.
|-
| sint32_LSB-MSB || Little-endian followed by big-endian encoded signed 32-bit integer.
|}
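
The both-endian layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name <code>read_both_endian</code> is hypothetical, and the check that both halves agree is an assumption of this sketch (a reader may equally well just take the half that matches its own byte order).

```python
import struct

def read_both_endian(buf, offset, width):
    """Decode an unsigned both-endian field.

    width is the size of ONE half in bytes: 2 for int16_LSB-MSB,
    4 for int32_LSB-MSB. The field occupies 2 * width bytes.
    """
    code = {2: "H", 4: "I"}[width]
    # First half: little-endian copy of the value.
    (little,) = struct.unpack_from("<" + code, buf, offset)
    # Second half: big-endian copy of the same value.
    (big,) = struct.unpack_from(">" + code, buf, offset + width)
    if little != big:
        # Assumption of this sketch: treat disagreement as an error.
        raise ValueError("both-endian halves disagree")
    return little

# 0x12345678 stored as int32_LSB-MSB occupies 8 bytes on disc.
field = bytes.fromhex("78563412" "12345678")
value = read_both_endian(field, 0, 4)
```

A stricter or looser policy for mismatched halves is a design choice; discs that are not fully compliant may need the looser one.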


=== Permitted Characters ===
The specification refers to two sets of characters: 'a-characters' and 'd-characters'. You will see these terms used in the descriptor tables throughout this article. The character sets are:


=== Date/time format ===
<Pre>

The date/time format used in the ''Primary Volume Descriptor'' is denoted as ''dec-datetime'' and uses ASCII digits to represent the main parts of the date/time:

{| {{wikitable}}
!Offset
!Size
!Datatype
!Description
|-
| 0 || 4 || strD || Year from 1 to 9999.
|-
| 4 || 2 || strD || Month from 1 to 12.
|-
| 6 || 2 || strD || Day from 1 to 31.
|-
| 8 || 2 || strD || Hour from 0 to 23.
|-
| 10 || 2 || strD || Minute from 0 to 59.
|-
| 12 || 2 || strD || Second from 0 to 59.
|-
| 14 || 2 || strD || Hundredths of a second from 0 to 99.
|-
| 16 || 1 || sint8 || Time zone offset from GMT in 15 minute intervals, recorded as a signed value from -48 (west) to +52 (east). A value of -48 therefore equals GMT-12 hours, 0 equals GMT, and +52 equals GMT+13 hours.
|}

All fields except for the offset from GMT are in ASCII digits. When the date and time is not specified, all string fields are ASCII '0' (for a total of 16 ASCII zeroes) and the last field is binary zero.
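
As a rough sketch of how the 17-byte field above might be split up, the following Python function slices the ASCII digits at the offsets from the table. The name <code>parse_dec_datetime</code> and the dictionary keys are hypothetical; the time zone byte is returned as stored, without interpretation.

```python
def parse_dec_datetime(field):
    """Split a 17-byte dec-datetime field into its parts.

    Returns None when the field is unspecified
    (16 ASCII zeroes followed by a binary zero).
    """
    if len(field) != 17:
        raise ValueError("dec-datetime fields are 17 bytes long")
    digits, gmt_offset_raw = field[:16], field[16]
    if digits == b"0" * 16 and gmt_offset_raw == 0:
        return None
    text = digits.decode("ascii")
    if not text.isdigit():
        raise ValueError("expected 16 ASCII digits")
    return {
        "year": int(text[0:4]),
        "month": int(text[4:6]),
        "day": int(text[6:8]),
        "hour": int(text[8:10]),
        "minute": int(text[10:12]),
        "second": int(text[12:14]),
        "hundredths": int(text[14:16]),
        # Left uninterpreted in this sketch.
        "gmt_offset_raw": gmt_offset_raw,
    }

stamp = parse_dec_datetime(b"2024013112305900" + b"\x00")
```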

=== String format ===
Character strings are encoded in ASCII, but the specification does not permit every character. It defines two sets of characters: 'a-characters' and 'd-characters'. You will see these terms used in the descriptor tables throughout this article. The character sets are:

<pre>
a-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
a-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
! " % & ' ( ) * + , - . / : ; < = > ?
! " % & ' ( ) * + , - . / : ; < = > ?
</pre>

<pre>
d-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
d-characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 _
</Pre>
</pre>


{| {{wikitable}}
== Volume Descriptors ==
! Encoding
! Description
|-
| strA || String with only ASCII a-characters, padded to the right with spaces.
|-
| strD || String with only ASCII d-characters, padded to the right with spaces.
|}
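
A strA or strD field can be decoded by stripping the right-hand space padding and checking what remains against the listed character set. The sketch below is one way to do that; <code>decode_padded</code>, <code>A_CHARS</code> and <code>D_CHARS</code> are hypothetical names. It accepts spaces only as trailing padding, which is a choice made for this example, and a tolerant reader may prefer to warn rather than raise.

```python
import string

# Character sets as listed above.
D_CHARS = set(string.ascii_uppercase + string.digits + "_")
A_CHARS = D_CHARS | set("!\"%&'()*+,-./:;<=>?")

def decode_padded(field, allowed):
    """Decode a space-padded strA/strD field and validate its characters."""
    text = field.decode("ascii").rstrip(" ")
    bad = sorted(set(text) - allowed)
    if bad:
        raise ValueError(f"characters outside the permitted set: {bad}")
    return text

identifier = decode_padded(b"CD001", A_CHARS)
volume = decode_padded(b"MY_VOLUME".ljust(32), D_CHARS)
```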


Note that not all CDs strictly adhere to the character sets specified in ISO 9660.
When preparing to mount a CD, your first action will be reading the volume descriptors (specifically, you will be looking for the Primary Volume Descriptor).

=== Filenames ===
Filenames must be in the 8.3 format (a name of up to 8 characters, followed by a period, followed by an extension of up to 3 characters), and use d-character encoding (strD).

== Volume Descriptors ==
When preparing to mount a CD, your first action will be reading the volume descriptors (specifically, you will be looking for the ''Primary Volume Descriptor'').


Sectors 0x00-0x0F of the CD are reserved for system use. This means that the Volume Descriptors can be found starting at sector 0x10. The format of the volume descriptors is as follows:
Sectors 0x00-0x0F of the CD are reserved for system use. This means that the Volume Descriptors can be found starting at sector 0x10. The format of the volume descriptors is as follows:
Line 35: Line 111:
! Offset
! Offset
! Length (bytes)
! Length (bytes)
! Field Name
! Field name
! Datatype
! Description
! Description
|-
|-
| 0 || 1 || Type || Volume Descriptor type code (see below).
| 0 || 1 || Type || int8 || Volume Descriptor type code (see below).
|-
|-
| 1 || 5 || Identifier || Always 'CD001'.
| 1 || 5 || Identifier || strA || Always 'CD001'.
|-
|-
| 6 || 1 || Version || Volume Descriptor Version (0x01).
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
|-
| 7 || 2041 || Data || Depends on the volume descriptor type.
| 7 || 2041 || Data || - || Depends on the volume descriptor type.
|}
|}


This means that each volume descriptor is therefore one sector (2KiB) long.
Each volume descriptor is therefore one sector (2 KiB) long.
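
The common 7-byte header can be unpacked as shown in this minimal Python sketch. It assumes the usual 2 KiB sector size; the function name <code>parse_descriptor_header</code> and the returned keys are hypothetical.

```python
import struct

SECTOR_SIZE = 2048  # assumption: the usual 2 KiB sector

def parse_descriptor_header(sector):
    """Split one volume descriptor sector into header fields and data."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one sector")
    # Offset 0: type (1 byte), offset 1: identifier (5 bytes),
    # offset 6: version (1 byte).
    type_code, identifier, version = struct.unpack_from("<B5sB", sector, 0)
    if identifier != b"CD001":
        raise ValueError("not an ISO 9660 volume descriptor")
    return {"type": type_code, "version": version, "data": sector[7:]}

# A synthetic descriptor of type 1 with an all-zero data area.
sample = bytes([1]) + b"CD001" + bytes([1]) + bytes(2041)
header = parse_descriptor_header(sample)
```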


=== Volume Descriptor Type Codes ===
=== Volume Descriptor Type Codes ===
The Volume Descriptor Type field specifies the type of Volume Descriptor:

{| {{wikitable}}
{| {{wikitable}}
! Value
! Value
! Description
! Description
|-
|-
| 0 || Volume descriptor is a Boot Record
| 0 || Boot Record
|-
|-
| 1 || Primary Volume Descriptor
| 1 || Primary Volume Descriptor
Line 68: Line 145:
|}
|}


When starting out with a basic CD, we are going to be interested in the Primary Volume Descriptor, which points us to the root directory and path tables, which both allow us to find any file on the CD. Using the path table is ideal for minimal implementations which do not wish to search the directory hierarchy node by node. This is slower (string comparisons across the entire filesystem) but easier to implement.
When starting out with a basic CD, we are interested in the Primary Volume Descriptor, which points us to the root directory and the path tables; either one allows us to find any file on the CD. The path table suits minimal implementations that do not wish to search the directory hierarchy node by node: a path table lookup is slower (string comparisons across the entire file system) but easier to implement.


=== The Boot Record ===
=== The Boot Record ===

The first type of Volume Descriptor is the "Boot Record". The descriptor format is as follows:
The first type of Volume Descriptor is the "Boot Record". The descriptor format is as follows:


Line 77: Line 153:
! Offset
! Offset
! Length (bytes)
! Length (bytes)
! Field Name
! Field name
! Datatype
! Value
! Description
! Description
|-
|-
| 0 || 1 || Type || 0 || Zero indicates a boot record.
| 0 || 1 || Type || int8 || Zero indicates a boot record.
|-
|-
| 1 || 5 || Identifier || 'CD001' || Always 'CD001'.
| 1 || 5 || Identifier || strA || Always "CD001".
|-
|-
| 6 || 1 || Version || 0x01 || Volume Descriptor Version (0x01).
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
|-
| 7 || 32 || Boot System Identifier || (string) || ID of the system which can act on and boot the system from the boot record in a-characters.
| 7 || 32 || Boot System Identifier || strA || ID of the system which can act on and boot the system from the boot record.
|-
|-
| 39 || 32 || Boot Identifier || (string) || Identification of the boot system defined in the rest of this descriptor in a-characters.
| 39 || 32 || Boot Identifier || strA || Identification of the boot system defined in the rest of this descriptor.
|-
|-
| 71 || 1977 || Boot System Use || Unspecified || Custom - used by the boot system.
| 71 || 1977 || Boot System Use || - || Custom - used by the boot system.
|}
|}


=== The Primary Volume Descriptor ===
=== The Primary Volume Descriptor ===

This is a lengthy descriptor, but it contains some very useful information for reading the rest of the file system.
This is a lengthy descriptor, but it contains some very useful information for reading the rest of the file system.


Line 101: Line 176:
! Offset
! Offset
! Length (bytes)
! Length (bytes)
! Field Name
! Field name
! Datatype
! Description
! Description
|-
|-
| 0 || 1 || Type Code || Always 0x01 for a Primary Volume Descriptor.
| 0 || 1 || Type Code || int8 || Always 0x01 for a Primary Volume Descriptor.
|-
|-
| 1 || 5 || Standard Identifier || Always 'CD001'.
| 1 || 5 || Standard Identifier || strA || Always 'CD001'.
|-
|-
| 6 || 1 || Version || Always 0x01.
| 6 || 1 || Version || int8 || Always 0x01.
|-
|-
| 7 || 1 || Unused || Always 0x00.
| 7 || 1 || Unused || - || Always 0x00.
|-
|-
| 8 || 32 || System Identifier || The name of the system that can act upon sectors 0x00-0x0F for the volume in a-characters.
| 8 || 32 || System Identifier || strA || The name of the system that can act upon sectors 0x00-0x0F for the volume.
|-
|-
| 40 || 32 || Volume Identifier || Identification of this volume in d-characters.
| 40 || 32 || Volume Identifier || strD || Identification of this volume.
|-
|-
| 72 || 8 || Unused Field ||
| 72 || 8 || Unused Field || - || All zeroes.
|-
|-
| 80 || 8 || Volume Space Size || Number of Logical Blocks in which the volume is recorded. This is a 32 bit value in both-endian format.
| 80 || 8 || Volume Space Size || int32_LSB-MSB || Number of Logical Blocks in which the volume is recorded.
|-
|-
| 88 || 32 || Unused Field ||
| 88 || 32 || Unused Field || - || All zeroes.
|-
|-
| 120 || 4 || Volume Set Size || The size of the set in this logical volume (number of disks). This is a 16 bit value in both-endian format.
| 120 || 4 || Volume Set Size || int16_LSB-MSB || The size of the set in this logical volume (number of disks).
|-
|-
| 124 || 4 || Volume Sequence Number || The number of this disk in the Volume Set. This is a 16 bit value in both-endian format.
| 124 || 4 || Volume Sequence Number || int16_LSB-MSB || The number of this disk in the Volume Set.
|-
|-
| 128 || 4 || Logical Block Size || The size in bytes of a logical block in both-endian format. NB: This means that a logical block on a CD could be something other than 2KiB!
| 128 || 4 || Logical Block Size || int16_LSB-MSB || The size in bytes of a logical block. NB: This means that a logical block on a CD could be something other than 2 KiB!
|-
|-
| 132 || 8 || Path Table Size || The size in bytes of the path table in 32 bit both-endian format.
| 132 || 8 || Path Table Size || int32_LSB-MSB || The size in bytes of the path table.
|-
|-
| 140 || 4 || Location of Type-L Path Table || LBA location of the path table, recorded in LSB-first (little endian) format. The path table pointed to also contains LSB-first values.
| 140 || 4 || Location of Type-L Path Table || int32_LSB || LBA location of the path table. The path table pointed to contains only little-endian values.
|-
|-
| 144 || 4 || Location of the Optional Type-L Path Table || LBA location of the optional path table, recorded in LSB-first (little endian) format. The path table pointed to also contains LSB-first values. Zero means that no optional path table exists.
| 144 || 4 || Location of the Optional Type-L Path Table || int32_LSB || LBA location of the optional path table. The path table pointed to contains only little-endian values. Zero means that no optional path table exists.
|-
|-
| 148 || 4 || Location of Type-M Path Table || LBA location of the path table, recorded in MSB-first (big-endian) format. The path table pointed to also contains MSB-first values.
| 148 || 4 || Location of Type-M Path Table || int32_MSB || LBA location of the path table. The path table pointed to contains only big-endian values.
|-
|-
| 152 || 4 || Location of Optional Type-M Path Table || LBA location of the optional path table, recorded in MSB-first (big-endian) format. The path table pointed to also contains MSB-first values.
| 152 || 4 || Location of Optional Type-M Path Table || int32_MSB || LBA location of the optional path table. The path table pointed to contains only big-endian values. Zero means that no optional path table exists.
|-
|-
| 156 || 34 || Directory entry for the root directory. || Note that this is not an LBA address, it is the actual Directory Record, which contains a zero-length Directory Identifier, hence the fixed 34 byte size.
| 156 || 34 || Directory entry for the root directory || - || This is not an LBA; it is the actual Directory Record. The root Directory Identifier is a single 0x00 byte, hence the fixed 34 byte size.
|-
|-
| 190 || 128 || Volume Set Identifier || Identifier of the volume set of which this volume is a member in d-characters.
| 190 || 128 || Volume Set Identifier || strD || Identifier of the volume set of which this volume is a member.
|-
|-
| 318 || 128 || Publisher Identifier || The volume publisher in a-characters. If unspecified, all bytes should be 0x20. For extended publisher information, the first byte should be 0x5F, followed by an 8.3 format file name. This file must be in the root directory and the filename is made from d-characters.
| 318 || 128 || Publisher Identifier || strA || The volume publisher. For extended publisher information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
|-
| 446 || 128 || Data Preparer Identifier || The identifier of the person(s) who prepared the data for this volume. Format as per Publisher Identifier.
| 446 || 128 || Data Preparer Identifier || strA || The identifier of the person(s) who prepared the data for this volume. For extended preparation information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
|-
| 574 || 128 || Application Identifier || Identifies how the data are recorded on this volume. Format as per Publisher Identifier.
| 574 || 128 || Application Identifier || strA || Identifies how the data are recorded on this volume. For extended information, the first byte should be 0x5F, followed by the filename of a file in the root directory. If not specified, all bytes should be 0x20.
|-
|-
| 702 || 38 || Copyright File Identifier || Identifies a file containing copyright information for this volume set. The file must be contained in the root directory and is in 8.3 format. If no such file is identified, the characters in this field are all set to 0x20.
| 702 || 38 || Copyright File Identifier || strD || Filename of a file in the root directory that contains copyright information for this volume set. If not specified, all bytes should be 0x20.
|-
|-
| 740 || 36 || Abstract File Identifier || Identifies a file containing abstract information for this volume set in the same format as the Copyright File Identifier field.
| 740 || 36 || Abstract File Identifier || strD || Filename of a file in the root directory that contains abstract information for this volume set. If not specified, all bytes should be 0x20.
|-
|-
| 776 || 37 || Bibliographic File Identifier || Identifies a file containing bibliographic information for this volume set. Format as per the other File Identifier fields.
| 776 || 37 || Bibliographic File Identifier || strD || Filename of a file in the root directory that contains bibliographic information for this volume set. If not specified, all bytes should be 0x20.
|-
|-
| 813 || 17 || Volume Creation Date and Time || Date and Time format as specified below.
| 813 || 17 || Volume Creation Date and Time || dec-datetime || The date and time at which the volume was created.
|-
|-
| 830 || 17 || Volume Modification Date and Time || Date and Time format as specified below.
| 830 || 17 || Volume Modification Date and Time || dec-datetime || The date and time at which the volume was last modified.
|-
|-
| 847 || 17 || Volume Expiration Date and Time || Date and Time format as specified below. After this date and time, the volume should be considered obsolete. If unspecified, then the information is never considered obsolete.
| 847 || 17 || Volume Expiration Date and Time || dec-datetime || The date and time after which this volume is considered to be obsolete. If not specified, then the volume is never considered to be obsolete.
|-
|-
| 864 || 17 || Volume Effective Date and Time || Date and Time format as specified below. Date and time from which the volume should be used. If unspecified, the volume may be used immediately.
| 864 || 17 || Volume Effective Date and Time || dec-datetime || The date and time after which the volume may be used. If not specified, the volume may be used immediately.
|-
|-
| 881 || 1 || File Structure Version || An 8 bit number specifying the directory records and path table version (always 0x01).
| 881 || 1 || File Structure Version || int8 || The directory records and path table version (always 0x01).
|-
|-
| 882 || 1 || Unused || Always 0x00.
| 882 || 1 || Unused || - || Always 0x00.
|-
|-
| 883 || 512 || Application Used || Contents not defined by ISO 9660.
| 883 || 512 || Application Used || - || Contents not defined by ISO 9660.
|-
|-
| 1395 || 653 || Reserved || Reserved by ISO.
| 1395 || 653 || Reserved || - || Reserved by ISO.
|}
|}
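
To illustrate how the offsets in the table translate into code, here is a small Python sketch that pulls a handful of fields out of a Primary Volume Descriptor sector. The function name <code>parse_pvd</code> and the dictionary keys are hypothetical, and for both-endian fields the sketch reads only the little-endian half.

```python
import struct

def parse_pvd(sector):
    """Extract selected fields from a Primary Volume Descriptor sector."""
    if sector[0] != 1 or sector[1:6] != b"CD001":
        raise ValueError("not a Primary Volume Descriptor")
    return {
        # strD, padded to the right with spaces.
        "volume_id": sector[40:72].decode("ascii").rstrip(" "),
        # int32_LSB-MSB: little-endian half at offset 80.
        "volume_space_size": struct.unpack_from("<I", sector, 80)[0],
        # int16_LSB-MSB: little-endian half at offset 128.
        "logical_block_size": struct.unpack_from("<H", sector, 128)[0],
        # int32_LSB-MSB: little-endian half at offset 132.
        "path_table_size": struct.unpack_from("<I", sector, 132)[0],
        # int32_LSB and int32_MSB path table locations.
        "type_l_path_table_lba": struct.unpack_from("<I", sector, 140)[0],
        "type_m_path_table_lba": struct.unpack_from(">I", sector, 148)[0],
        # The 34-byte root Directory Record itself, left unparsed here.
        "root_directory_record": bytes(sector[156:190]),
    }

# Build a synthetic descriptor to exercise the parser.
demo = bytearray(2048)
demo[0] = 1
demo[1:6] = b"CD001"
demo[6] = 1
demo[40:72] = b"TESTDISC".ljust(32)
struct.pack_into("<I", demo, 80, 1000)
struct.pack_into(">I", demo, 84, 1000)
struct.pack_into("<H", demo, 128, 2048)
struct.pack_into(">H", demo, 130, 2048)
struct.pack_into("<I", demo, 140, 19)
struct.pack_into(">I", demo, 148, 21)
pvd = parse_pvd(bytes(demo))
```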


==== Date and Time Format ====
=== Volume Descriptor Set Terminator ===


The Volume Descriptor Set Terminator does not currently define bytes 7-2047 of its Volume Descriptor. This means that the only fields in use for the volume set terminator are the type code (255), the standard identifier ('CD001') and the descriptor version (0x01).
The date / time format used in the Primary Volume Descriptor is:


{| {{wikitable}}
{| {{wikitable}}
!Offset
! Offset
! Length (bytes)
!Size
! Field name
!Description
! Datatype
! Description
|-
|-
| 0 || 4 || Year from 1 to 9999.
| 0 || 1 || Type || int8 || 255 indicates a Volume Descriptor Set Terminator.
|-
|-
| 4 || 2 || Month from 1 to 12.
| 1 || 5 || Identifier || strA || Always "CD001".
|-
|-
| 6 || 2 || Day from 1 to 31.
| 6 || 1 || Version || int8 || Volume Descriptor Version (0x01).
|-
| 8 || 2 || Hour from 0 to 23.
|-
| 10 || 2 || Minute from 0 to 59.
|-
| 12 || 2 || Second from 0 to 59.
|-
| 14 || 2 || Hundredths of a second from 0 to 99.
|-
| 16 || 1 || Offset from GMT in 15 minute intervals from -48 (West) to +52 (East)
|}
|}
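
Putting the pieces together, a reader can walk the descriptors from sector 0x10 until it finds the Primary Volume Descriptor or reaches the terminator. The following Python sketch assumes 2 KiB sectors and a seekable binary image; <code>find_primary_volume_descriptor</code> and <code>make_descriptor</code> are hypothetical names used only for this example.

```python
import io

SECTOR_SIZE = 2048  # assumption: the usual 2 KiB sector

def find_primary_volume_descriptor(image):
    """Scan volume descriptors starting at sector 0x10.

    image is a seekable binary file-like object. Returns the PVD sector,
    or None if the terminator (type 255) is reached first.
    """
    sector_number = 0x10
    while True:
        image.seek(sector_number * SECTOR_SIZE)
        sector = image.read(SECTOR_SIZE)
        if len(sector) < SECTOR_SIZE or sector[1:6] != b"CD001":
            raise ValueError("descriptor set ended without a terminator")
        if sector[0] == 1:    # Primary Volume Descriptor
            return sector
        if sector[0] == 255:  # Volume Descriptor Set Terminator
            return None
        sector_number += 1

def make_descriptor(type_code):
    """Build a synthetic descriptor sector with an empty data area."""
    return bytes([type_code]) + b"CD001" + bytes([1]) + bytes(2041)

# Synthetic image: 16 reserved sectors, boot record, PVD, terminator.
image = io.BytesIO(
    bytes(16 * SECTOR_SIZE)
    + make_descriptor(0)
    + make_descriptor(1)
    + make_descriptor(255)
)
found = find_primary_volume_descriptor(image)
```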

All fields except for the offset from GMT are in ASCII digits. An unspecified date and time is represented by 16 '0' digits, followed by a zero in the last field.


=== Volume Descriptor Set Terminator ===

The Volume Descriptor Set Terminator does not currently define bytes 7-2047 of its Volume Descriptor. This means that the only fields in use for the volume set terminator are the type code (255), the standard identifier ('CD001') and the descriptor version (0x01).


== The Path Table ==
== The Path Table ==