There are several attributes that determine the quality and size of a digital audio file. They are the sampling rate, the number of bits per sample, the way samples are stored (sign and byte order), and the number of channels.

The sampling rate is the number of times per second that the amplitude level (or state) is captured. It is measured in hertz (Hz, or 1/second). A high sampling rate results in higher quality digital sound, in the same way that high resolution video shows better picture quality. Compact discs, for example, use a sampling rate of 44100Hz, whereas telephone systems use a rate of only 8000Hz. If you've ever heard music on the telephone while on hold, you'll have noticed a big difference in quality compared to the original music played on a CD player.

Higher sampling rates capture a wider range of frequencies and maintain a smoother waveform. The figure below shows a real world waveform in red and the digital waveform in black at different sampling rates. You can see that increasing the sampling rate makes each step of the digital waveform narrower, so the shape more closely follows the real world waveform. In general, the height of each step is reduced as well, but that depends on the number of bits. In simple terms, the sampling rate controls the width of each step.

Figure: Sampling Rate
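The relationship between sampling rate and step width can be sketched in a few lines of Python. This is only an illustration, not part of GoldWave: the function names and the 440Hz test tone are assumptions chosen for the example.

```python
import math

def sample_wave(frequency_hz, sampling_rate_hz, duration_s):
    """Capture the amplitude of a sine wave sampling_rate_hz times per second."""
    count = round(sampling_rate_hz * duration_s)
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(count)]

def step_width_ms(sampling_rate_hz):
    """Width of one step of the digital waveform, in milliseconds."""
    return 1000.0 / sampling_rate_hz

# The same 10 milliseconds of a 440Hz tone at telephone and CD rates.
telephone = sample_wave(440, 8000, 0.01)
cd = sample_wave(440, 44100, 0.01)

print(len(telephone), "samples, each step", step_width_ms(8000), "ms wide")
print(len(cd), "samples, each step", round(step_width_ms(44100), 4), "ms wide")
```

The CD rate stores more samples for the same stretch of sound, so each step is narrower and the digital waveform follows the original more closely.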

The rate to use depends upon the type of sound and the amount of storage space available. Higher rates consume more space. In the above example, the CD requires over 5 times as much storage as the telephone system for the same digital sound. Certain types of sounds can be recorded at lower rates without loss of quality. Some standard rates are listed in the table.

Standard Sampling Rates

Sampling Rate   Quality and Usage                                                                       MB/Minute (16 bit, mono)
8000Hz          Low quality. Used for telephone systems. Good for speech. Not recommended for music.    0.960
11025Hz         Fair quality. Good for speech and AM radio recordings.                                  1.323
22050Hz         Medium quality. Good for TV and FM radio quality music.                                 2.646
44100Hz         High quality. Used for audio CDs.                                                       5.292
48000Hz         High quality. Used for digital audio tapes (DAT).                                       5.760
96000Hz         Very high quality. Used for DVD audio.                                                  11.520
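The MB/Minute figures follow from simple arithmetic: samples per second, times bytes per sample, times channels, times 60 seconds. The sketch below reproduces them, assuming 1 MB means 1,000,000 bytes (which is what the table values imply). The function name is made up for this example.

```python
def megabytes_per_minute(sampling_rate_hz, bits=16, channels=1):
    """Storage needed for one minute of uncompressed audio, in MB (1 MB = 1,000,000 bytes)."""
    bytes_per_second = sampling_rate_hz * (bits // 8) * channels
    return bytes_per_second * 60 / 1_000_000

for rate in (8000, 11025, 22050, 44100, 48000, 96000):
    print(f"{rate}Hz: {megabytes_per_minute(rate):.3f} MB/minute")

# Stereo needs twice the storage of mono at the same rate and bit size.
print(f"CD stereo: {megabytes_per_minute(44100, channels=2):.3f} MB/minute")
```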

As explained in the section, the number of bits determines how accurately the amplitude of the waveform is captured. The figure below shows a real world waveform in red and the corresponding digital waveforms with 2 bit samples and 3 bit samples.

Figure: Bits

You can see that adding a single bit greatly improves the way the digital waveform conforms to the real world waveform. The 2 bit waveform looks like a rough approximation with large steps. Several amplitudes are rounded to the same state, such as samples 9 through 11. This is a source of quantization noise, explained later.

In the 3 bit waveform, no amplitudes are rounded to the same state. Each step is half the height of the 2 bit waveform, but it is still not perfect. From sample 1 to sample 2, there is a jump in the waveform, which also causes quantization noise, though to a much lesser extent. You'll notice that samples 0 and 1 are below the real waveform and samples 2 and 3 are above the waveform. This occurs because there are no in-between states to accurately store those amplitude levels, so the digital waveform ends up straddling the real one. Therefore more states, and more bits, are needed.

8 bit and 16 bit samples are common. In an 8 bit sample, there are 256 different states or levels of amplitude. 16 bit samples have 65,536 levels. This makes a huge difference in terms of sound quality. Audio stored as 8 bit samples will often have much more quantization noise.
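The effect of the bit count can be tried out with a small Python sketch. It assumes amplitudes scaled to the range 0.0 to 1.0 and simply divides that range into equal steps, one per state. The function names are hypothetical, and real converters may round differently.

```python
def levels(bits):
    """Number of distinct states available with the given number of bits."""
    return 2 ** bits

def quantize(amplitude, bits):
    """Map an amplitude in the range 0.0 to 1.0 onto one of the available states."""
    clamped = min(max(amplitude, 0.0), 1.0)
    # Each state covers an equal slice of the range; 1.0 falls in the top state.
    return min(int(clamped * levels(bits)), levels(bits) - 1)

def step_height(bits):
    """Height of one step as a fraction of the full amplitude range."""
    return 1.0 / levels(bits)

print(levels(8), levels(16))                   # 256 65536
print(quantize(0.30, 2), quantize(0.45, 2))    # 1 1  (rounded to the same state)
print(quantize(0.30, 3), quantize(0.45, 3))    # 2 3  (one more bit separates them)
print(step_height(2), step_height(3))          # 0.25 0.125
```

Two different amplitudes that collapse into one state with 2 bits are stored as separate states with 3 bits, and each step is half as tall.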

Samples can be stored as bits in a couple of different ways. One way is to consider all the states as positive, with no values below zero. As shown in the figure above, the states 00, 01, 10, and 11 are the same as the positive numbers 0, 1, 2 and 3. This eliminates the need for a negative sign. Such samples are called unsigned. For 8 bit samples, the states would range from 0 to 255.

The other way is to use a form known as two's complement, which allows both positive and negative values. These samples are called signed. Since real world waveforms tend to fluctuate through a range of positive and negative values, signed samples are preferred. For 16 bit samples, the states would range from -32768 to 32767.
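The two ranges, and the way the same bits read differently under each scheme, can be checked with a short Python sketch. The helper names are invented for this example.

```python
def unsigned_range(bits):
    """Smallest and largest state when all states are treated as positive."""
    return 0, 2 ** bits - 1

def signed_range(bits):
    """Smallest and largest state when using two's complement."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def as_signed(raw, bits):
    """Reinterpret an unsigned bit pattern as a two's complement value."""
    if raw >= 2 ** (bits - 1):
        return raw - 2 ** bits
    return raw

print(unsigned_range(8))        # (0, 255)
print(signed_range(16))         # (-32768, 32767)
print(as_signed(0xFFFF, 16))    # -1
print(as_signed(0x7FFF, 16))    # 32767
```

The bit pattern itself does not say which interpretation applies, so a program reading samples has to know whether they are signed or unsigned.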

When a sample is stored using more than 8 bits, more than one byte is needed. The term endian is used to describe the way bytes are ordered in computer memory. It specifies the significance of the first byte in the group. A 16 bit sample, for example, requires exactly two bytes, byte A and byte B. They can be stored as A first, then B, or as B first, then A. Traditionally a PC stores them one way and older Mac systems stored them the other way, due to differences in the internal processor design of those systems.

Big endian order has the most significant byte stored first, making it similar to the way we read numbers. In the number 47, the 4 is first and is most significant and the 7 is last and is least significant. This ordering was used on older Mac systems based on Motorola and PowerPC processors.

Little endian order has the least significant byte stored first, allowing some optimizations in processing. This ordering is used on Intel and PC systems.
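Both byte orders can be seen using Python's standard struct module, where ">" selects big endian and "<" selects little endian. The sample value 0x1234 is an arbitrary choice for this sketch: 0x12 is the most significant byte and 0x34 the least significant.

```python
import struct

sample = 0x1234  # a signed 16 bit sample (4660 in decimal)

big = struct.pack(">h", sample)      # most significant byte first
little = struct.pack("<h", sample)   # least significant byte first

print(big.hex())      # 1234
print(little.hex())   # 3412

# Reading bytes with the wrong order gives a different sample value.
correct = struct.unpack("<h", little)[0]
wrong = struct.unpack(">h", little)[0]
print(correct, wrong)  # 4660 13330
```

This is why a program has to know the byte order of a file before it can interpret samples larger than 8 bits.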

Digital audio can have one or more channels. Single channel audio, referred to as monaural (or mono) audio, contains information for only one speaker and is similar to AM radio. Two channel audio, or stereo audio, contains information for two speakers, much like FM stereo. Stereo sounds can add depth, but they require twice as much storage and processing time as mono sounds. Most movie theatres have advanced audio systems with 4 or more channels, which are capable of making sounds appear to come from certain directions. Audio containing more than 2 channels is referred to as multichannel. GoldWave currently supports up to 8 channels in a 7.1 surround sound format as shown in the table.

Channel Layout

Channels                                                                3.1   5.1   7.1
Front left (FL), Center (C), Front right (FR)                           Yes   Yes   Yes
Low frequency effects (LFE)                                             Yes   Yes   Yes
Back (5.1)/Side (7.1) left (SL), Back (5.1)/Side (7.1) right (SR)       No    Yes   Yes
Back (7.1) left (BL), Back (7.1) right (BR)                             No    No    Yes

Due to the internal structure of Wave files and the , the "back" channels are stored before the "side" channels in 7.1 surround sound. Therefore GoldWave refers to the 5.1 surround channels as "back" channels and displays the 7.1 "side" channels below the "back" channels.
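The storage order can be pictured as a simple list per format. The sketch below is not taken from GoldWave. It assumes the common Wave file convention of front left, front right, center, then LFE for the first four positions, and it labels positions by how they are stored (back before side) rather than by the abbreviations in the table above.

```python
# Assumed storage order of channels within each sample frame (common Wave convention).
STORAGE_ORDER = {
    "3.1": ["FL", "FR", "C", "LFE"],
    "5.1": ["FL", "FR", "C", "LFE", "BL", "BR"],
    "7.1": ["FL", "FR", "C", "LFE", "BL", "BR", "SL", "SR"],
}

def channel_position(layout, channel):
    """Zero-based position of a channel within one frame of the given layout."""
    return STORAGE_ORDER[layout].index(channel)

# In 7.1, the back channels come before the side channels.
print(channel_position("7.1", "BL"), channel_position("7.1", "SL"))  # 4 6

# The two surround positions used by 5.1 stay in the same place in 7.1.
print(STORAGE_ORDER["7.1"][:6] == STORAGE_ORDER["5.1"])  # True
```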