CD-ROM Digital Data (CD-ROM, ISO 9660, "High Sierra")

The standard that describes how digital data are to be recorded on compact disk media went through several different iterations before the format was finalized. The first step was the creation of the original data format standard, called the "yellow book", by Philips and Sony in 1983. This specification was based on the original "red book" format that was the basis for CD digital audio disks.

The "yellow book" specification was unfortunately so general that many feared different companies would use it to implement proprietary data storage formats, resulting in many incompatible data CDs. To try to prevent this, representatives of major manufacturers met at the High Sierra Hotel and Casino in Lake Tahoe, NV, in 1985, to agree on a common standard for data CDs. This format was nicknamed High Sierra Format. It was later modified slightly and adopted as ISO standard 9660.

Today, the terms "yellow book", High Sierra and ISO 9660 are used somewhat interchangeably to refer to standard data CDs, although the most common name is simply "CD-ROM". This isn't technically precise, but the important thing is that virtually all data CDs in use today are standardized and will work in all standard CD-ROM drives, which was the main objective of all of this, of course. For simplicity, I will call this format "data CD" for the rest of this section.

Under the data CD standard, there are two modes defined:

  • Mode 1: This is the standard data storage mode used by virtually all standard data CDs. The data is laid out in basically the same way as it is in standard audio CD format, except that the 2,352 bytes in each block are broken down further. 2,048 of these bytes are for "real" data. Of the other 304 bytes, 16 hold synchronization and header information that identify the block, and the remaining 288 are used for an additional level of error detecting and correcting code. This extra protection is necessary because data CDs cannot tolerate the loss of a handful of bits now and then, the way audio CDs can.
  • Mode 2: Mode 2 data CDs are the same as mode 1 CDs except that the additional error detecting and correcting codes are omitted, which leaves 2,336 of the 2,352 bytes in each block available for data. Why define this mode at all, when it is so similar to the standard audio CD format? The reason is that mode 2 provides a more flexible vehicle for storing types of data that do not require high data integrity: for example, graphics and video can use this format. Furthermore, different kinds of data can be mixed together on the same disk; this is the basis for the extensions to the original data CD standards known as CD-ROM Extended Architecture, or CD-ROM XA.
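
The block layouts of the two modes can be sketched in a few lines of Python. This is only an illustration, not part of any standard tool: the field names and the split_block helper are made up for this example, and the individual field sizes (12 bytes of sync, 4 bytes of header, and so on) are taken from the published sector layout in ECMA-130 rather than from the text above.

```python
# Field sizes, in bytes, for one 2,352-byte raw block.
# Assumption: sizes follow the ECMA-130 sector layout; the field names
# themselves are illustrative only.
RAW_BLOCK_SIZE = 2352

MODE1_LAYOUT = [
    ("sync", 12),         # fixed pattern marking the start of the block
    ("header", 4),        # block address plus the mode byte
    ("user_data", 2048),  # the "real" data
    ("edc", 4),           # error detection code
    ("reserved", 8),      # unused filler bytes
    ("ecc", 276),         # error correction code
]

MODE2_LAYOUT = [
    ("sync", 12),
    ("header", 4),
    ("user_data", 2336),  # no extra error detection or correction
]


def overhead(layout):
    """Return how many bytes of a raw block are not user data."""
    return RAW_BLOCK_SIZE - dict(layout)["user_data"]


def split_block(raw, layout):
    """Slice one raw block into named fields (hypothetical helper)."""
    if len(raw) != RAW_BLOCK_SIZE:
        raise ValueError("a raw block must be exactly 2,352 bytes")
    fields = {}
    offset = 0
    for name, size in layout:
        fields[name] = raw[offset:offset + size]
        offset += size
    return fields


if __name__ == "__main__":
    print(overhead(MODE1_LAYOUT))  # 304
    print(overhead(MODE2_LAYOUT))  # 16
```

Both layouts add up to the same 2,352 bytes; the only difference is how much of each block is given over to protecting the data rather than carrying it.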

The rest of this section is concerned with "plain vanilla" CD-ROM data disks, which are mode 1 under the ISO 9660 standard. Each block contains 2,048 bytes of real data. As with the audio format, there are 75 blocks per "second" of the disk, so a standard 74 minute compact disk holds 74 * 60 * 75 = 333,000 blocks, for a total capacity of 681,984,000 bytes. This is the same as the commonly-heard 650 MB (actually 650.39 binary MB). Since the disk is designed to allow the reading of 75 blocks per second, this is the basis for the standard single-speed transfer rate of 75 * 2,048 = 153,600 bytes, or 150 KB, per second. Of course, faster CD-ROM drives transfer at much higher rates.
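
The arithmetic above is easy to check with a short Python sketch. The function names are invented for this example, and the speed parameter assumes that an "Nx" drive simply reads N times as many blocks per second, which is a nominal figure rather than a guaranteed one.

```python
BLOCKS_PER_SECOND = 75       # same block rate as the audio format
USER_BYTES_PER_BLOCK = 2048  # "real" data in each mode 1 block


def capacity_bytes(minutes):
    """Mode 1 data capacity of a disk with the given playing time."""
    return minutes * 60 * BLOCKS_PER_SECOND * USER_BYTES_PER_BLOCK


def transfer_rate(speed=1):
    """Nominal bytes per second, assuming speed is a simple multiplier."""
    return speed * BLOCKS_PER_SECOND * USER_BYTES_PER_BLOCK


if __name__ == "__main__":
    total = capacity_bytes(74)
    print(total)                     # 681984000
    print(round(total / 2**20, 2))   # 650.39 (binary MB)
    print(transfer_rate(1) // 1024)  # 150 (KB per second)
```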

Much the way a hard disk or floppy disk has a file allocation table and root directory to identify the place to look in order to find the various directories and files on the disk, a data CD needs this "starting point" as well. At the start of the CD, a table of contents lists what is on the disk and where to find it. Newer CD formats that are said to be "multi-session" can have more than one set of data on the disk, recorded at different times, and therefore use multiple tables of contents, one per session. The table of contents is also sometimes called the index of the disk.

