Digital Signal Processing/Sound Processing

Digital Sound

Sampling Frequency

Sound in the digital realm is stored in one or more arrays of discrete samples, with each array corresponding to a channel (e.g. stereo sound requires two channels, and thus two arrays of samples). The time interval between consecutive samples is constant and is chosen according to the kind of signal being represented. Since we are interested in sound, and the upper limit of human hearing is generally accepted to be about 20 kHz, the Nyquist-Shannon sampling theorem can be used to determine the sampling interval needed to accurately reconstruct the signals we are interested in.
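As a minimal sketch of this layout, the Python snippet below builds two channels as plain lists of samples and derives the constant interval between samples from the sampling rate. The 44,100 Hz rate, the tone frequencies and the helper name `sine_channel` are assumptions chosen for illustration only.

```python
import math

SAMPLE_RATE = 44_100                  # samples per second (assumed for illustration)
SAMPLE_INTERVAL = 1 / SAMPLE_RATE     # constant time between consecutive samples, in seconds


def sine_channel(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return one channel: a list of sine-tone samples in the range [-1.0, 1.0]."""
    n_samples = int(round(duration_s * sample_rate))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


# Stereo sound: two channels, and thus two arrays of equal length.
left = sine_channel(440.0, 0.01)
right = sine_channel(880.0, 0.01)
stereo = [left, right]
```

Sample number n of either channel corresponds to the time n × `SAMPLE_INTERVAL` seconds, which is about 22.7 microseconds per step at this rate.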

This theorem states that

Exact reconstruction of a continuous-time baseband signal from its samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the signal bandwidth.

Essentially, this means that a signal limited to a certain frequency range (audible sound: roughly 20 Hz to 20 kHz) can be reconstructed without error if it is sampled at a rate greater than twice its bandwidth. The Red Book audio CD standard sets the sampling rate at 44,100 Hz. This rate exceeds the 40 kHz minimum that the theorem requires for a 20 kHz signal, so it can represent frequencies up to just under 22.05 kHz. The margin above 20 kHz leaves room for the anti-aliasing filter to roll off.
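To see why the sampling rate must exceed twice the highest frequency, the following sketch samples two cosine tones at 44,100 Hz: one below half the sampling rate and one above it. The specific frequencies (19,100 Hz and 25,000 Hz) and the helper name `sample_cosine` are assumptions picked so that the two tones mirror each other around 22,050 Hz.

```python
import math


def sample_cosine(freq_hz, sample_rate, n_samples):
    """Sample a unit-amplitude cosine tone at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


SAMPLE_RATE = 44_100
NYQUIST = SAMPLE_RATE / 2     # half the sampling rate: 22,050 Hz

in_band = sample_cosine(19_100, SAMPLE_RATE, 64)    # below half the sampling rate
too_high = sample_cosine(25_000, SAMPLE_RATE, 64)   # above it: 44,100 - 25,000 = 19,100

# The two sample sequences agree up to floating-point rounding, so once
# sampled, the 25,000 Hz tone cannot be told apart from the 19,100 Hz tone.
max_difference = max(abs(a - b) for a, b in zip(in_band, too_high))
```

Because the samples coincide, any content above half the sampling rate has to be filtered out before sampling; otherwise it reappears as a spurious lower frequency (aliasing).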

44.1 kHz is the general standard sampling rate for digital audio on consumer-level equipment, although 48 kHz is common when working with film or video. Many recording engineers also prefer to record classical or otherwise complex music at 88.2 or 96 kHz, and some claim to be able to perceive a difference.

When converting from 48 kHz to 44.1 kHz, audible artifacts (sometimes described as a sonic blurring) can occur. The two rates are related by the non-integer ratio 147:160, so the converter cannot simply reuse existing samples and must instead interpolate new ones, and the quality of the result depends on the filtering used. The conversion from 88.2 kHz to 44.1 kHz, or from 96 kHz to 48 kHz, is simpler to perform: after low-pass filtering the signal to below half the new sampling rate to prevent aliasing, the computer or device doing the conversion only has to keep every second sample. One way to sidestep digital rate conversion is to use a high-quality digital-to-analog converter to bring, for example, a 48 kHz signal back to its analog form, and then feed it into another high-quality analog-to-digital converter that re-samples the signal at 44.1 kHz. This technique is common practice in recording studios where high-end equipment can be trusted to do the conversion transparently. In other situations, the distortion caused by converting the audio in software or hardware may be of little concern.
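The difference between the two kinds of conversion can be sketched as follows. The function names `resampling_ratio` and `decimate_by_two` are hypothetical. Averaging adjacent pairs stands in for the low-pass filter purely for illustration; it is a very crude filter, and real converters use much longer ones.

```python
from fractions import Fraction


def resampling_ratio(source_rate, target_rate):
    """Return (up, down): upsample by `up`, then downsample by `down`."""
    ratio = Fraction(target_rate, source_rate)
    return ratio.numerator, ratio.denominator


def decimate_by_two(samples):
    """Halve the sampling rate of a list of samples.

    Each adjacent pair is averaged (a crude low-pass filter, assumed here
    only for illustration) and one value is kept per pair.
    """
    return [(samples[i] + samples[i + 1]) / 2
            for i in range(0, len(samples) - 1, 2)]


awkward = resampling_ratio(48_000, 44_100)   # (147, 160)
simple = resampling_ratio(96_000, 48_000)    # (1, 2)
halved = decimate_by_two([1.0, 1.0, 3.0, 5.0])
```

A ratio of 147:160 means no output sample at 44.1 kHz lines up with an input sample at 48 kHz except every 160th input sample, so nearly every output value has to be interpolated. A ratio of 1:2 only requires filtering and discarding.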

Bits Per Sample

While the sampling frequency determines the time resolution of an audio signal, the number of bits per sample determines how finely its amplitude is represented. Red Book audio CDs store each sample as a 16-bit signed integer. This means that when an audio signal is converted for use on a CD, each sample's value is quantized to an integer in the range -32768 to +32767.
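A minimal sketch of this quantization step is shown below. It assumes the input samples are floating-point values nominally in the range -1.0 to 1.0 and scales them by 32767; both the scaling convention and the function name `quantize_16bit` are assumptions, since other conventions exist.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def quantize_16bit(sample):
    """Map a float sample (nominally in [-1.0, 1.0]) to a signed 16-bit integer.

    The value is scaled, rounded to the nearest integer, and clipped so
    that out-of-range input cannot exceed the 16-bit limits.
    """
    scaled = round(sample * INT16_MAX)
    return max(INT16_MIN, min(INT16_MAX, scaled))


silence = quantize_16bit(0.0)
full_scale = quantize_16bit(1.0)
clipped = quantize_16bit(-2.0)   # out-of-range input is clipped to the minimum
```

Rounding to the nearest integer discards a small part of each sample's value. This rounding error is the quantization noise, and using more bits per sample makes it smaller.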

Wave Files

Wave (WAV) files store a digital representation of sound. The audio data is typically stored uncompressed, which means it can be sent to the digital-to-analog converter for playback without an added decompression step. It also means that the format consumes a great deal of storage: CD-quality stereo audio requires 44,100 samples × 2 channels × 2 bytes = 176,400 bytes per second, or roughly 10 MB per minute.
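The storage cost can be checked with Python's standard `wave` module. The sketch below writes one second of a stereo test tone to an in-memory buffer instead of a file on disk. The tone frequency, the half-scale amplitude and the function name `tone_wav_bytes` are assumptions for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100      # samples per second, per channel
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 2      # 16-bit samples

# Uncompressed data rate: 44,100 * 2 * 2 = 176,400 bytes per second.
bytes_per_second = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE


def tone_wav_bytes(freq_hz=440.0, duration_s=1.0):
    """Return the bytes of a WAV file holding a stereo sine tone."""
    n_frames = int(SAMPLE_RATE * duration_s)
    frames = bytearray()
    for n in range(n_frames):
        value = int(round(0.5 * 32767 *
                          math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)))
        # One frame = one little-endian 16-bit sample per channel.
        frames += struct.pack("<hh", value, value)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(BYTES_PER_SAMPLE)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


wav_data = tone_wav_bytes()
```

One second of audio yields slightly more than 176,400 bytes, because the file also carries a small header describing the channel count, sample width and sampling rate.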

MP3 Compression

OGG Compression