Kees A. Schouhamer Immink
Turing Machines Inc, Rotterdam, The Netherlands,
Institute for Experimental Mathematics, Essen, Germany.
Abstract – An audio compact disc (CD)
holds up to 74 minutes, 33 seconds of sound, just enough for a complete mono
recording of Ludwig van Beethoven's Ninth Symphony (‘Alle Menschen werden
Brüder’) at probably the slowest pace it has ever been played, during
the Bayreuther Festspiele in 1951 and conducted by Wilhelm Furtwängler. Each
second of music requires about 1.5 million bits, which are represented as tiny
pits and lands ranging from 0.9 to 3.3 micrometers in length. More than 19
billion channel bits are recorded as a spiral track of alternating pits and
lands over a distance of 5.38 kilometers (3.34 miles), which are scanned at
walking speed, 4.27 km per hour.
It is now 25 years since
Philips and Sony introduced the CD. In this jubilee article we discuss the
crucial technical decisions that would determine the technical
success or failure of the new medium.
In 1973, I started my work on servo systems and electronics
for the videodisc in the Optics group of Philips Research in Eindhoven. The
videodisc is a 30 cm diameter optical disc that can store up to 60 minutes of
analog FM-modulated video and sound. It is like a DVD, but much
larger, heavier, and less reliable. The launch of the videodisc in 1975 was a
technical success, but a monumental marketing failure, since consumers
showed no interest at all. After two years, Philips decided to throw
in the towel and withdrew the product from the market.

While my colleagues and I were working on the videodisc, two Philips engineers were
asked to develop an audio-only disc based on optical videodisc technology. The
two engineers were recruited from the audio department, since my research
director believed a sound-only disc was a trivial matter given a video and
sound videodisc, and he refused to waste costly researchers’ time on it. In
retrospect, given the long-forgotten videodisc and the CD’s great success, this
seems a remarkable decision.
The audio engineers started by experimenting with an analog approach using wide-band
frequency modulation as in FM radio. Their experiments revealed that the analog
solution was scarcely more immune to dirt and scratches than a conventional
analog LP. Three years later they decided to look for a digital solution. In
1976 and later, Philips and Sony independently demonstrated the first
prototypes of a digital disc using laser videodisc technology. In 1977, Sony
completed a prototype with a 30 cm diameter disc, the same as the videodisc,
and 60 minutes playing time [2]. In October 1979, a crucial high-level decision
was made to join forces in the development of a world audio disc standard.
Philips and Sony, although competitors in many areas, shared a long history of
cooperation, for instance in the joint establishment of the compact cassette
standard in the 1960s. In marketing the final products, however, both firms
would compete against each other again. Philips brought its expertise and the
huge videodisc patent portfolio to the alliance, and Sony contributed its
expertise in digital audio technology. In addition, both firms had a
significant presence in the music industry via CBS/Sony, a joint venture
between CBS Inc. and Sony Japan Records Inc. dating from the late 1960s, and
Polygram, a 50% subsidiary of Philips [4].
Within a few weeks, a joint task force of experts was
formed. As the only electronics engineer within the ‘Optics’ research group, I
participated and dealt with servos, coding, and electronics at large. In 1979
and 1980, a number of meetings, alternating between Tokyo and Eindhoven, were
held. The first meeting, in August 1979 in Eindhoven, and the second meeting,
in October 1979 in Tokyo, provided an opportunity for the engineers to get to
know each other and to learn each other’s main strengths. Both companies had
shown prototypes and it was decided to take the best of both worlds. During the
third technical meeting on December 20, 1979, both partners wrote down their
list of preferred main specifications for the audio disc.
Although there are many other specifications, such as the
dimensions of the pits, disc thickness, diameter of the inner hole, etcetera,
these are too technical to be discussed here.
As the list below shows, a lot of work remained: the partners agreed on only one item, namely the one-hour playing time. The other target parameters (sampling rate, quantization, and notably disc diameter) look very similar, but were worlds apart.
| Item | Philips | Sony |
| Sampling rate (kHz) | 44.0 - 44.5 | 44.1 |
| Quantization | 14 bit | 16 bit |
| Playing time (min) | 60 | 60 |
| Diameter (mm) | 115 | 100 |
| EC Code | t.b.d. | t.b.d. |
| Channel Code | M3 | t.b.d. |
t.b.d. = to be discussed
The Shannon-Nyquist sampling theorem dictates that in order to achieve
lossless sampling, the signal should be sampled with a frequency at least twice
the signal’s bandwidth. So for a bandwidth of 20 kHz a sampling frequency of at
least 40 kHz is required. A large number of people, especially young people,
are perfectly capable of hearing sounds at frequencies well above 20 kHz. That
is, in theory, all we can say. In 1978, each and every piece of digital audio
equipment used its own ‘well-chosen’ sampling frequency ranging from 32 to 50
kHz. Modern digital audio equipment accepts many different sampling rates, but
the CD task force opted for only one frequency, namely 44.1 kHz. This sampling
frequency was chosen mainly for logistical reasons, as will be discussed below,
once we have explained the state of the art of digital audio recording in 1979.
Towards the end of the 1970s, ‘PCM adapters’ were developed in Japan,
which used ordinary analog video tape recorders as a means of storing digital
audio data, since these were the only widely available recording devices with
sufficient bandwidth. The best commonly-available video recording format at the
time was the 3/4" U-Matic.
The presence of the PCM video-based adaptors explains the
choice of sampling frequency for the CD, as the number of video lines, frame
rate, and bits per line end up dictating the sampling frequency one can achieve
for storing stereo audio. The sampling frequencies of 44.1 and 44.056 kHz were
the direct result of a need for compatibility with the NTSC and PAL video
formats. Essentially, since there were no other reliable recording products
available at that time that offered other options in sampling rates, the
Sony/Philips task force could only choose between 44.1 and 44.056 kHz and 16 bits
resolution (or less).
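The arithmetic behind these two figures can be sketched as follows. The line counts and the packing of three samples per video line used below are not stated in this article; they are the commonly quoted figures for video-based PCM adapters and should be read as assumptions.

```python
from fractions import Fraction

# Assumed figures (not from the article): usable video lines per field,
# fields per second, and audio samples per channel stored on each line.
SAMPLES_PER_LINE = 3

def sampling_rate(lines_per_field, fields_per_second):
    """Samples per second per audio channel for a video-based PCM adapter."""
    return lines_per_field * fields_per_second * SAMPLES_PER_LINE

rate_ntsc_mono = sampling_rate(245, 60)                      # 60 Hz monochrome field rate
rate_pal = sampling_rate(294, 50)                            # 50 Hz field rate
rate_ntsc_color = sampling_rate(245, Fraction(60000, 1001))  # 59.94 Hz color field rate

print(rate_ntsc_mono)           # 44100
print(rate_pal)                 # 44100
print(float(rate_ntsc_color))   # roughly 44055.94
```

Under these assumptions both the 50 Hz and the 60 Hz video systems give exactly 44.1 kHz, while the slightly lower color field rate gives the 44.056 kHz variant.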
During the fourth meeting, held in Tokyo on March 18-19, 1980, Philips
accepted Sony’s original proposal of 16-bit resolution and a
44.1 kHz sampling rate. 44.1 kHz as opposed to 44.056 kHz was chosen for the
simple reason that it was easier to remember. Philips dropped their wish to use
14-bit resolution: they had no technical rationale, as the wish for 14 bits
was in fact based only on the availability of their 14-bit digital-to-analog
converter. In other words, the Compact Disc sound quality equals the sound
quality of Sony’s PCM-1600 adaptor.
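As a rough check on the figures involved, the raw audio bit rate follows directly from the agreed sampling rate and resolution. The sketch below assumes two channels (stereo) and uses the maximum playing time of 74 minutes 33 seconds quoted in the abstract. It counts audio bits only, before any error correction or channel coding overhead.

```python
SAMPLING_RATE_HZ = 44_100      # agreed sampling rate
BITS_PER_SAMPLE = 16           # agreed resolution
CHANNELS = 2                   # assumption: stereo

audio_bits_per_second = SAMPLING_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

playing_time_s = 74 * 60 + 33  # 74 minutes 33 seconds
total_audio_bits = audio_bits_per_second * playing_time_s

print(audio_bits_per_second)               # 1411200
print(round(total_audio_bits / 8 / 1e6))   # 789 (million bytes)
```

The result, roughly 1.4 million audio bits per second, is close to the rounded figure given in the abstract.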
Thus, quite remarkably, in recording practice, an audio CD starts life
as a PCM master tape, recorded on a U-Matic videotape cassette, where the audio
data is converted to digital information superimposed within a standard
television signal. The industry standard hardware to do this was the Sony
PCM-1600, the first commercial video-based 16-bit recorder, followed by the
PCM-1610 or PCM-1630 adaptors. Until the 1990s, only video cassettes could be
used as a means for exchanging digital sound from the studios to the CD
mastering houses. Later, Exabyte computer tapes, CD-Rs, and memory sticks were
used as transport vehicles.
Coding systems
Coding techniques form the basis of modern digital
transmission and storage systems. There had been previous practical
applications of coding, especially in space communications, but the Compact
Disc was the first mass-market electronics product equipped with fully-fledged
error correction and channel coding systems. To gain an idea of the types of
errors, random versus burst errors, burst length distribution and so on, we
made discs that contained known coded sequences. Burst error length
distributions were measured for virgin, scratched, or dusty discs. The error
measurement was relatively simple, but scratching or fingerprinting a disc in
such a way that it can still be played is far from easy. How do you get a disc
with the right kind of sticky dust? During playing, most of the dust fell off
the disc into the player, and the optics engineers responsible for the player
were obviously far from happy with our dust experiments. The experimental discs
we used were handmade, not pressed like commercial mass-produced polycarbonate discs. In retrospect, I think that
the channel characterization was a far from adequate instrument for the design
of the error correction code (ECC).
There were only two competing ECC proposals to be studied.
Experiments in Tokyo and Eindhoven (Japanese dust was not the same as Dutch dust) were conducted to verify the performance of the two proposed
ECCs. Sony proposed a byte-oriented, rate 3/4, Cross-Interleaved Reed-Solomon
code (CIRC) [6]. Vries of Philips designed an interleaved convolutional, rate
2/3, code having a basic unit of information of 3-bit characters [9]. CIRC uses
two short RS codes, namely (32, 28, 5) and (28, 24, 5) RS codes using a
Ramsey-type of interleaver. If a major burst error occurs and the ECC is overloaded,
it is possible to obtain an approximation of an audio sample by interpolating
the neighboring audio samples, so concealing uncorrectable samples in the audio
signal. CIRC has various nice features to make error concealment possible, so
extending the player's operation range [10].
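The parameters quoted above already determine the code rate and the correction power of each component code. The sketch below works this out for the two (n, k, d_min) Reed-Solomon codes, and adds a deliberately simple illustration of concealment by linear interpolation. The function names are hypothetical, and averaging the two neighbors is only one possible interpolation rule; the article does not specify the concealment method used in actual players.

```python
from fractions import Fraction

# (n, k, d_min) of the two Reed-Solomon codes named in the text.
C2 = (28, 24, 5)
C1 = (32, 28, 5)

def rate(code):
    n, k, _ = code
    return Fraction(k, n)

def correctable_errors(code):
    """Symbol errors at unknown positions that a code with this d_min can correct."""
    return (code[2] - 1) // 2

def correctable_erasures(code):
    """Symbol errors at known (flagged) positions that can be corrected."""
    return code[2] - 1

overall_rate = rate(C1) * rate(C2)
print(overall_rate)               # 3/4
print(correctable_errors(C1))     # 2
print(correctable_erasures(C2))   # 4

def conceal(samples, bad):
    """Replace each flagged sample by the mean of its two neighbors.

    Hypothetical sketch: assumes isolated bad samples with good neighbors,
    which is what interleaving aims to achieve after a long burst.
    """
    out = list(samples)
    for i in bad:
        out[i] = (samples[i - 1] + samples[i + 1]) // 2
    return out

print(conceal([100, 120, 0, 160, 180], bad=[2]))   # [100, 120, 140, 160, 180]
```

The product of the two component rates, 28/32 times 24/28, reproduces the overall rate of 3/4 mentioned for CIRC.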
CIRC showed both a much higher performance and code rate (and thus
playing time), although it was extremely complicated to cast into silicon at the time.
Sony used a 16 kByte RAM for data interleaving, which then cost around $50 and
added significantly to the sales price of the player. During the fifth meeting
in Eindhoven, May 1980, the partners agreed on the CIRC error correction code
since our experiments had shown its great resilience against mixtures of random
and burst errors [11]. The fully correctable burst length is about 4,000 bits
(around 1.5 mm missing data on the disc). The length of errors that can be
concealed is about 12,000 bits (around 7.5 mm). However, the largest error
burst we ever measured during the many long days of disc channel
characterization was 0.1 mm.
We also had to decide on the channel code. This is a vital
component as it has a considerable impact on both the playing time and the
quality of ‘disc handling’ or 'playability'. Servo systems follow the track of
alternating pits and lands in three dimensions, namely radial, focal, and
rotational speed. Everyday handling damage, such as dust, fingerprints, and
tiny scratches, not only affects retrieved data, but also disrupts the servo
functions. In the worst cases, the servos may skip tracks or get stuck, and error
correction systems become utterly worthless. A product with such devastating
weaknesses would remain a laboratory toy. A well-designed channel code will
make it possible to remove the major barriers related to these playability
issues.
The system designer should find a good trade-off between
long playing time and playability. Both partners proposed some form of (d, k)
runlength-limited (RLL) codes, where d is the minimum number and k
is the maximum number of zeros between consecutive ones. The differences
between the various proposals were the code rate, runlength parameters d
and k, and the spectral content. The spectral content has a direct
bearing on the playability. In their prototype, Philips used the proprietary M3
channel code, a rate ½, d=1, k=5 code, with a well-suppressed
spectral content [1]. M3 is a variation on the M2 code, which was developed in
the 1970s by Ampex Inc. for their digital video tape recorder [5]. Sony started
with a rate 1/3, d=5, RLL code, but since that did not work, they
changed horses in midstream and proposed a proprietary rate ½, d=2, k=7
code, a type of code that had been used in magnetic disk data storage. Neither
Sony code had spectral suppression, and the engineers had opposing
views on how the servo and synchronization issues could be solved. In May 1980,
the choice of the channel code therefore remained open, and ‘more study was
needed’. Before continuing with the coding cliffhanger, we take a musical
break.
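To make the (d, k) constraint concrete, the sketch below checks whether a binary sequence obeys it, using the definition given above. The helper name `satisfies_dk` and the example sequences are hypothetical. Only runs of zeros enclosed by two ones are checked, so leading and trailing zeros are ignored.

```python
def satisfies_dk(bits, d, k):
    """Return True if every run of zeros between consecutive ones has length in [d, k]."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    for left, right in zip(ones, ones[1:]):
        zeros_between = right - left - 1
        if zeros_between < d or zeros_between > k:
            return False
    return True

sequence = [1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]

print(satisfies_dk(sequence, d=2, k=7))       # True: runs of 2 and 5 zeros
print(satisfies_dk(sequence, d=1, k=5))       # True
print(satisfies_dk(sequence, d=5, k=10))      # False: the run of 2 zeros is too short
print(satisfies_dk([1, 1, 0, 1], d=1, k=5))   # False: adjacent ones violate d=1
```

With d=2, for example, consecutive ones are always separated by at least two zeros, while k bounds how long the sequence can go without a one.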
Playing time and disc diameter are probably the parameters
most visible to consumers. Clearly, these two are related: a 5% increase in
disc diameter yields about 10% more disc area, and thus an increase in playing time
of about 10%. Philips’ top management made the proposal regarding the disc diameter. They
argued, 'The Compact Audio Cassette was a great success', and, 'we don't think
CD should be much larger'. The diagonal of the Compact Audio Cassette,
very popular at that time and also developed by Philips, is 115 mm. The Philips
prototype audio disc and player were based on this idea, and the Philips team
of engineers restated this view in the list of preferred main parameters. Sony, no doubt with portable players in mind, initially
preferred a smaller 100 mm disc.
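The rule of thumb relating diameter to playing time follows from area growing with the square of the diameter. The sketch below is a simplification that treats the whole disc as recordable and ignores the unrecorded inner region, so the percentages are indicative only.

```python
def area_gain_percent(old_diameter_mm, new_diameter_mm):
    """Percentage increase in disc area when the diameter changes."""
    ratio = new_diameter_mm / old_diameter_mm
    return (ratio ** 2 - 1) * 100

print(round(area_gain_percent(100, 105), 2))   # 10.25
print(round(area_gain_percent(115, 120), 1))   # 8.9
```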
During the May 1980 meeting something remarkable happened.
The minutes of the May 1980 meeting in Eindhoven literally read:
disc diameter: 120 mm,
playing time: 75 minutes,
track pitch: 1.45 µm,
can be achieved with the Philips M3 channel code. However,
the negative points are: large numerical aperture needed which entails smaller
(production) margins, and the Philips’ M3 code might infringe on Ampex M2.
Both disc diameter and playing time differ significantly
from the preferred values listed during the Tokyo meeting in December 1979. So
what happened during those six months? The minutes of the meetings do not give
any clue as to why the changes to playing time and disc diameter were made.
According to the Philips’ website with the ‘official’ history: "The
playing time was determined posthumously by Beethoven". The wife of Sony's
vice-president, Norio Ohga, decided that she wanted the composer's Ninth
Symphony to fit on a CD. It was, Sony’s website explains, Mrs. Ohga's favorite
piece of music. The Philips’ website proceeds:
“The performance by the Berlin Philharmonic, conducted by
Herbert von Karajan, lasted for 66 minutes. Just to be quite sure, a check was
made with Philips’ subsidiary, Polygram, to ascertain what other recordings
there were. The longest known performance lasted 74 minutes. This was a mono
recording made during the Bayreuther Festspiele in 1951 and conducted by
Wilhelm Furtwängler. This therefore became the maximum playing time of a CD. A
diameter of 120 mm was required for this playing time”.
Everyday practice is less romantic than the pen of a public
relations guru. At that time, Philips’ subsidiary Polygram (one of the
world's largest distributors of music) had set up a CD plant in Hanover,
Germany. This could produce large quantities of CDs with, of course, a diameter of
115 mm. Sony did not have such a facility yet. If Sony had agreed on the 115 mm
disc, Philips would have had a significant competitive edge in the music
market. Sony was aware of that, did not like it, and something had to be done.
It was not about Mrs. Ohga’s great passion for music, but about money and competition between the two
partners in the market. The decision regarding diameter and playing time was taken outside
the group of experts responsible for the CD format. So I, a former member of
that group, can only guess what happened on the upper floor. But something
unforeseen happened: at the last minute
we changed the code.
Popular literature, as exemplified by the Philips website
mentioned above, states that the disc diameter is a direct result of the
requested playing time, and that the extra 14
minutes of playing time for Furtwängler’s Ninth subsequently required the change from a 115 mm to a 120 mm disc. It suggests
that no other factors affect playing time. Note that in May
1980, when disc diameter and playing time were agreed, the channel code, a key
factor affecting playing time, was not yet settled. In the minutes of the May
1980 meeting, it was remarked that the above (diameter, playing time, and track
pitch) could be achieved with Philips' M3 channel code. In the meantime, but
not mentioned in the minutes of the May meeting, the author was experimenting
with a new channel code, later coined EFM [3]. EFM, a rate 8/17, d=2
code, made it possible to achieve a 30 percent higher information density than
Philips' M3. Due to its good spectral suppression, EFM also showed good
resilience against disc handling damage such as fingerprints, dust, and
scratches. Note that a 30 percent efficiency improvement is highly attractive,
since, for example, the disc diameter increase from 115 to 120 mm only offers a
mere 10 percent increase in playing time.
A month later, in June 1980, we could not choose the channel
code, and again more study and experiments were needed. Although experiments
had shown the greater information density that could be obtained with EFM, it
was at first simply rejected. At the end of the discussion, which at times was
heated, the Sony people specifically opposed the complexity of the EFM
decoder, which then required 256 gates. My remark that the CIRC decoder needed
at least half a million gates and that the extra 256 gates for EFM were
irrelevant was jeered at. Then suddenly, during the meeting, we received a
phone call from the presidents of Sony and Philips, who were meeting in Tokyo.
We were running out of time, they said, and one week for an extra, final,
meeting in Tokyo was all the lads could get. Sony stated that if the EFM
hardware required fewer than 80 gates, they would accept it. I had a week to
reduce the gate count. I used the first Apple II computer in the lab, which was
much handier for such an interactive design using trial and error than the IBM
mainframe, especially as I had to walk to the IBM computer center for every
job. I succeeded in bringing the gate count down to just 52 gates, and on June
19, 1980 in Tokyo, Sony agreed to EFM. The 30 percent extra information density
offered by EFM could have been used to reduce the diameter to 115 mm or even
100 mm (with, of course, the requested 74 minutes and 33 seconds for playing
Mrs. Ohga’s favorite Ninth). However, such a change was not considered to be
politically feasible, as the powers that be had decided on 120 mm. The option to
increase the playing time to 97 minutes was not even considered. We decided to
improve the production margins of player and disc by lowering the information
density by 30 percent: the disc diameter remained 120 mm, the track pitch was
increased from 1.45 to 1.6 µm, and the user bit length was increased from 0.5 to
0.6 µm. By increasing the bit size in two dimensions, in a similar vein to large
letters being easier to read, the disc was easier to read, and could be
introduced without too many technical complications.
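The numbers in this trade-off can be checked with a few lines of arithmetic. The sketch assumes that the area occupied by one user bit is simply track pitch times user bit length, and that playing time scales in proportion to information density; both are simplifications made here for illustration.

```python
def area_per_bit(track_pitch_um, bit_length_um):
    """Disc area taken by one user bit, in square micrometers."""
    return track_pitch_um * bit_length_um

before = area_per_bit(1.45, 0.5)   # values noted in May 1980
after = area_per_bit(1.6, 0.6)     # values finally adopted

# Each bit occupies about 32% more area after the change.
print(round(after / before, 2))          # 1.32

# Playing time if the 30 percent gain had been spent on duration instead.
playing_time_min = 74 + 33 / 60
print(round(playing_time_min * 1.30))    # 97
```

Under these assumptions the two relaxed dimensions together absorb roughly the 30 percent that EFM had gained, and the unused alternative corresponds to the 97 minutes mentioned in the text.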
The maximum playing time of the CD was settled at 74 minutes
and 33 seconds. In practice, however, the maximum playing time was
determined by the playing time of the U-Matic video recorder, which was 72
minutes. Therefore, rather sadly, Mrs. Ohga’s favorite Ninth by Furtwängler
could not be recorded in full on a single CD until 1988, when alternative
digital transport media became available. On a slightly different note, Jimi Hendrix's Electric Ladyland, with a playing time
of 75 minutes, was originally released as a 2-CD set in the early 1980s, but has
been on a single CD since 1997.
The Sony/Philips task force stood on the shoulders of the
Philips’ engineers who created the laser videodisc technology in the 1970s.
Given the videodisc technology, the task force made choices regarding various
mechanical parameters such as disc diameter, pit dimensions, and audio
parameters such as sampling rate and resolution. In addition, two basic patents
were filed related to error correction, CIRC, and channel code, EFM. CIRC, the
Reed-Solomon ECC format, was completely engineered and developed by Sony
engineers, and EFM was completely created and developed by the author.
Let us take a look at the numbers. The size of the task
force varied per meeting, and the average number of attendees listed on the
minutes of the joint meetings is twelve. If the persons carrying hierarchical
responsibility of the CD project are excluded then we find a very small group
of engineers who carried the technical responsibility of the Compact Disc ‘Red
Book’ standard.
Philips' corporate public relations department, see The
Inventor of the CD on Philips' website [7], states that the CD was "too
complex to be invented by a single individual", and the "Compact Disc
was invented collectively by a large group of people working as a team".
It persuades us to believe that progress is the product of institutions, not
individuals. Evidently, there were battalions of very capable engineers, who
further developed and marketed the product, and success in the market depended
on many other innovations. For example, the solid-state physicists, who
developed an inexpensive laser diode, a primary enabling technology, made CD
possible in practice. Credit should also be given to the persons who designed
the transparent Compact Disc storage case, the 'jewel box', which made a clever
contribution to the visual appeal of the CD.
Philips and Sony agreed in a memorandum dated June 1980,
that their contributions to channel and error correction codes are equal.
Sony’s website with their 'official' history is entitled 'Our contributions
are equal' [8]. The website proceeds, “We
avoid such comments as, ‘We developed this part and that part’ and to emphasize
that the disc's development was a joint effort by saying, ‘Our contributions
are equal’. The leaders of the task force convinced the engineers to put their
companies before individual achievements.”
The myth building even went so far that the patent applications for both
CIRC and EFM were filed with joint Sony/Philips inventors.
Everything else is gaslight
A favorite expression of audiophiles
–particularly during the early period, when they were comparing both vinyl LP
and CD versions of the same recordings– was: "It is as though a veil has
been lifted from the music". Or, in the words of the famous Austrian
conductor Herbert von Karajan, when he first heard CD audio: "Everything
else is gaslight". Von Karajan was fond of the gaslight metaphor: he
first conducted Der Rosenkavalier in 1956 with the soprano Elisabeth
Schwarzkopf. Later, when he revived the opera in 1983 with Anna Tomowa, he
referred to his 1956 cast as "gaslight", which rather upset
Schwarzkopf.
Philips and Sony
set the introduction of the new product for November 1, 1982. The
moment the ink of the “Red Book”, detailing the CD specifications, was dry, the
race started, and hundreds of developers in Japan and the Netherlands were on
their way.
In early January 1982 it became clear that
Philips was running behind: the electronics were seriously delayed, and they
asked Sony to postpone the introduction. Sony
rejected the delay, but agreed to a
two-step launch. Sony would first market their CD players and discs in Japan,
where Philips had no market share, and half a year later, in March 1983, the
worldwide introduction by Philips and Sony would take place. Philips Polygram
could supply discs for the Japanese market. This gave Philips some breathing
space for the players, but not enough: in order to make the new deadline, the
first generation of Philips CD players was equipped with Sony electronics.
The first CD players cost over $2000, but just two years
later it was possible to buy them for under $350. Five years after the introduction, sales of CDs exceeded those of vinyl LPs.
Yet this was no great achievement, as in 1980 sales of vinyl records had been
declining for many years and the music industry was all but dead. A few
years later, the Compact Disc had completely replaced the vinyl LP and
cassette tape. Compact Disc technology was ideal for use as a low-cost,
mass-data storage medium, and the CD-ROM and record-once and re-writable media,
CD-R and CD-RW, respectively, were developed. In 1995, the CD was succeeded by
DVD, which offers a six-fold higher storage capacity. Now, 25 years after the
introduction of the CD, home cinema on DVD accounts for 70 percent of
Hollywood's worldwide film revenue. DVD has replaced VHS videotape. Hundreds of
millions of players and more than two hundred billion CD audio discs have been
sold.
Acknowledgement
The author warmly acknowledges the hospitality of the
Rotterdam Radio Museum, where the photos of classic CD players were made.
Further reading
[1] M.G. Carasso, W.J. Kleuters, and J.J. Mons, Method of coding data bits on a recording medium (M3 Code), US Patent 4,410,877, 1983.
[2] T. Doi, T. Itoh, and H. Ogawa, A Long-Play Digital Audio Disk System, AES Preprint 1442, Brussels, Belgium, March 1979.
[3] K.A.S. Immink and H. Ogawa, Method for Encoding Binary Data (EFM), US Patent 4,501,000, 1985.
[4] T. Kretschmer and K. Muehlfeld, Co-opetition in Standard-Setting: The Case of the Compact Disc.
[5] J.W. Miller, DC Free encoding for data transmission (M2 Code), US Patent 4,234,897, 1980.
[6] K. Odaka, Y. Sako, I. Iwamoto, T. Doi, and L. Vries, Error correctable data transmission method (CIRC), US Patent 4,413,340, 1983.
[8] Our contributions are equal, Sony’s historical website.
[9] L.B. Vries, The Error Control System of Philips Compact Disc, AES Preprint 1548, New York, Nov. 1979.
[10] K.A.S. Immink, ''Reed-Solomon Codes and the Compact Disc'' in S.B. Wicker and V.K. Bhargava, Eds., Reed-Solomon Codes and Their Applications, IEEE Press, 1994.
[11] J. Nathan, SONY, The private life, Houghton Mifflin Co, 1999, p. 140. Quote:
‘At the final session of the first round, held in Tokyo March 1980, the two teams tested one another’s error correction systems on discs that had been scratched, marked with fingerprints, even dusted with chalk. The Philips system proved inadequate to the extreme conditions, and Sony was judged the winner. There were protests from the Philips team that the test conditions were extreme, but their manager agreed that the test had been fair, and that the Sony error correction mechanism was adopted.’
[12] K.A.S. Immink, ‘The CD Story’, AES Journal, pp. 458-465, May 1998.