What translates analog voice signals into digital data and uses the Internet to transport the data?

A codec, which stands for coder-decoder, converts an audio signal into compressed digital form for transmission and then back into an uncompressed audio signal for replay. It's the essence of VoIP.

Table of Contents

Translating from Analog to Digital
How Telephones Transmit Voice Data
How Instant Messenger Transmits Voice Data
Networking Your Car
MANAGEMENT FOCUS
Voice over Internet Protocol (VoIP)

Codecs accomplish the conversion by sampling the audio signal several thousand times per second. For instance, a G.711 codec samples the audio 8,000 times a second and encodes each sample in 8 bits, which produces 64,000 bits per second. It converts each tiny sample into digitized data for transmission. When the 8,000 samples are reassembled, the pieces of audio missing between each sample are so small that to the human ear, it sounds like one continuous second of audio signal. The resulting bit rate in VoIP depends on the codec being used:

64,000 bits per second
32,000 bits per second
8,000 bits per second

A G.729A codec samples 8,000 times per second, compresses the audio to 8,000 bits per second, and is one of the most commonly used codecs in VoIP.

Codecs use advanced algorithms to help sample, sort, compress and packetize audio data. The CS-ACELP algorithm (conjugate-structure algebraic-code-excited linear prediction), which is the algorithm behind G.729, is one of the most prevalent algorithms in VoIP. CS-ACELP reduces the bandwidth that each call needs. Annex B of the G.729 specification adds a transmission rule, which basically states "if no one is talking, don't send any data." The efficiency created by this rule is one of the greatest ways in which packet switching is superior to circuit switching.
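The "if no one is talking, don't send any data" rule can be illustrated with a simple energy gate. This is only a minimal sketch and not the Annex B algorithm itself: the function names, the 80-sample frame (10 ms at 8,000 samples per second) and the threshold value are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[int]) -> float:
    """Mean squared amplitude of one frame of samples."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def frames_to_send(samples: Sequence[int], frame_size: int = 80,
                   threshold: float = 100.0) -> List[List[int]]:
    """Keep only the frames whose energy exceeds the (assumed) threshold."""
    kept = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        if frame_energy(frame) > threshold:
            kept.append(frame)  # someone is talking: send this frame
        # otherwise: silence, so nothing is sent for this frame
    return kept


# Hypothetical input: one loud frame followed by one near-silent frame.
speech = [1000, -1000] * 40
silence = [1, -1] * 40
sent = frames_to_send(speech + silence)
```

Only the loud frame ends up in `sent`; the near-silent frame consumes no bandwidth.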

The codec works with the algorithm to convert and sort everything out, but it's not any good without knowing where to send the data. In VoIP, that task is handled by soft switches.

E.164 is the ITU-T standard for international telephone numbering, and the North American Numbering Plan (NANP) follows it. NANP is the numbering system that North American phone networks use to know where to route a call based on the dialed numbers. A phone number is like an address:

(313) 555-1212

313 = State

555 = City

1212 = Street address

The switches use "313" to route the phone call to the area code's region. The "555" prefix sends the call to a central office, and the network routes the call using the last four digits, which are associated with a specific location. Based on that system, no matter where you are in the world, the number combination "(313) 555" always puts you in the same central office, which has a switch that knows which phone is associated with "1212."
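The three-part routing described above can be sketched as a small parser. This is an illustrative example only; the names `NanpNumber` and `parse_nanp` are hypothetical, and real switches apply many more validation rules.

```python
import re
from typing import NamedTuple


class NanpNumber(NamedTuple):
    area_code: str  # routes the call to a region
    prefix: str     # identifies the central office
    line: str       # identifies the line served by that office


def parse_nanp(number: str) -> NanpNumber:
    """Split a 10-digit NANP number into its three routing parts."""
    digits = re.sub(r"\D", "", number)  # strip punctuation and spaces
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return NanpNumber(digits[:3], digits[3:6], digits[6:])


parsed = parse_nanp("(313) 555-1212")
```

Here `parsed.area_code` is "313", `parsed.prefix` is "555" and `parsed.line` is "1212".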

The challenge with VoIP is that IP-based networks don't read phone numbers based on NANP. They look for IP addresses instead.

IP addresses correspond to a particular device on the network like a computer, a router, a switch, a gateway or a telephone. However, IP addresses are not always static. They are often assigned by a DHCP server on the network and can change from one connection to the next. VoIP's challenge is translating NANP phone numbers to IP addresses and then finding out the current IP address of the requested number. This mapping process is handled by a central call processor running a soft switch.

The central call processor is hardware that runs a specialized database/mapping program called a soft switch. Think of the user and the phone or computer as one package -- man and machine. That package is called the endpoint. The soft switch connects endpoints.

Soft switches know:

Where the network's endpoint is
What phone number is associated with that endpoint
The endpoint's current IP address
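The three things a soft switch knows can be pictured as a lookup table keyed by phone number. The sketch below is a toy model under stated assumptions: the class and method names are hypothetical, and the addresses come from the range reserved for documentation rather than from any real network.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Endpoint:
    name: str          # which endpoint this is
    phone_number: str  # the number associated with the endpoint
    ip_address: str    # the endpoint's current IP address


class SoftSwitch:
    """Toy number-to-address mapping, not a real call processor."""

    def __init__(self) -> None:
        self._by_number: Dict[str, Endpoint] = {}

    def register(self, endpoint: Endpoint) -> None:
        self._by_number[endpoint.phone_number] = endpoint

    def update_ip(self, phone_number: str, new_ip: str) -> None:
        # Called when the endpoint reconnects with a different address.
        self._by_number[phone_number].ip_address = new_ip

    def resolve(self, phone_number: str) -> Optional[str]:
        endpoint = self._by_number.get(phone_number)
        return endpoint.ip_address if endpoint else None


switch = SoftSwitch()
switch.register(Endpoint("desk phone", "3135551212", "192.0.2.10"))
before = switch.resolve("3135551212")
switch.update_ip("3135551212", "192.0.2.57")  # address changed on reconnect
after = switch.resolve("3135551212")
```

The dialed number stays the same while the address it resolves to changes, which is the mapping problem the soft switch solves.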


In the same way that digital computer data can be sent over analog telephone networks using analog transmission, analog voice data can be sent over digital networks using digital transmission. This process is somewhat similar to the analog transmission of digital data. A pair of special devices called codecs (code/decode) is used in the same way that a pair of modems is used to translate the data to send across the circuit. One codec is attached to the source of the signal (e.g., a telephone or the local loop at the end office) and translates the incoming analog voice signal into a digital signal for transmission across the digital circuit. A second codec at the receiver’s end translates the digital data back into analog data.

Translating from Analog to Digital

Analog voice data must first be translated into a series of binary digits before they can be transmitted over a digital circuit. This is done by sampling the amplitude of the sound wave at regular intervals and translating it into a binary number. Figure 3.24 shows an example where eight different amplitude levels are used (i.e., each amplitude level is represented by three bits). The top diagram shows the original signal, and the bottom diagram, the digitized signal.

A quick glance will show that the digitized signal is only a rough approximation of the original signal. The original signal had a smooth flow, but the digitized signal has jagged "steps."

Figure 3.24 Pulse amplitude modulation (PAM)

The difference between the two signals is called quantizing error. Voice transmissions using digitized signals that have a great deal of quantizing error sound metallic or machinelike to the ear.

There are two ways to reduce quantizing error and improve the quality of the digitized signal, but neither is without cost. The first method is to increase the number of amplitude levels. This minimizes the difference between the levels (the "height" of the "steps") and results in a smoother signal. In Figure 3.24, we could define 16 amplitude levels instead of eight levels. This would require four bits (rather than the current three bits) to represent the amplitude, thus increasing the amount of data needed to transmit the digitized signal.
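The effect of adding amplitude levels can be checked numerically. The following is a minimal sketch, assuming a signal scaled to the range -1.0 to 1.0 and evenly spaced levels; the function names and the test signal are illustrative and are not taken from Figure 3.24.

```python
import math
from typing import List


def quantize(value: float, bits: int) -> float:
    """Map a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)               # height of one "step"
    index = round((value + 1.0) / step)     # nearest level number
    index = max(0, min(levels - 1, index))  # stay inside the valid range
    return -1.0 + index * step


def max_quantizing_error(samples: List[float], bits: int) -> float:
    """Largest gap between an original sample and its quantized value."""
    return max(abs(s - quantize(s, bits)) for s in samples)


# One cycle of a sine wave, sampled at 100 points.
wave = [math.sin(2 * math.pi * n / 100) for n in range(100)]
error_3_bits = max_quantizing_error(wave, 3)  # 8 levels
error_4_bits = max_quantizing_error(wave, 4)  # 16 levels
```

Moving from three bits to four shrinks the worst-case error, at the cost of one extra bit per sample.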

No number of levels or bits will ever result in perfect-quality sound reproduction, but in general, seven bits (2^7 = 128 levels) reproduces human speech adequately. Music, on the other hand, typically uses 16 bits (2^16 = 65,536 levels).

The second method is to sample more frequently. This will reduce the "length" of each "step," also resulting in a smoother signal. To obtain a reasonable-quality voice signal, one must sample at a rate of at least twice the highest possible frequency in the analog signal. You will recall that the highest frequency transmitted in telephone circuits is 4,000 Hz. Thus, the methods used to digitize telephone voice transmissions must sample the input voice signal at a minimum of 8,000 times per second. Sampling more frequently than this (called oversampling) will improve signal quality. RealNetworks.com, which produces RealAudio and other Web-based tools, sets its products to sample at 48,000 times per second to provide higher quality. The iPod and most CDs sample at 44,100 times per second and use 16 bits per sample to produce almost error-free music. MP3 players often sample less frequently and use fewer bits per sample to produce smaller transmissions, but the sound quality may suffer.
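The "at least twice the highest frequency" rule is simple enough to express directly. This is a small sketch with hypothetical function names, using the figures quoted in the paragraph above.

```python
def minimum_sampling_rate(highest_frequency_hz: float) -> float:
    """Lowest sampling rate that satisfies the twice-the-highest-frequency rule."""
    return 2 * highest_frequency_hz


def is_oversampled(sampling_rate_hz: float, highest_frequency_hz: float) -> bool:
    """True when the rate exceeds the minimum required for the signal."""
    return sampling_rate_hz > minimum_sampling_rate(highest_frequency_hz)


telephone_minimum = minimum_sampling_rate(4000)      # telephone circuit limit
realaudio_oversampled = is_oversampled(48000, 4000)  # 48,000 samples per second
```

For a 4,000 Hz telephone circuit the minimum works out to 8,000 samples per second, so a 48,000 samples-per-second stream counts as oversampled.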

How Telephones Transmit Voice Data

When you make a telephone call, the telephone converts your analog voice data into a simple analog signal and sends it down the circuit from your home to the telephone company’s network. This process is almost unchanged from the one used by Bell when he invented the telephone in 1876. With the invention of digital transmission, the common carriers (i.e., the telephone companies) began converting their voice networks to use digital transmission. Today, all of the common carrier networks use digital transmission, except in the local loop (sometimes called the last mile), the wires that run from your home or business to the telephone switch that connects your local loop into the telephone network. This switch contains a codec that converts the analog signal from your phone into a digital signal. This digital signal is then sent through the telephone network until it reaches the switch that serves the local loop of the person you are calling. This switch uses its codec to convert the digital signal used inside the phone network back into the analog signal needed by that person’s local loop and telephone. See Figure 3.25.

There are many different combinations of sampling frequencies and numbers of bits per sample that could be used. For example, one could sample 4,000 times per second using 128 amplitude levels (i.e., 7 bits) or sample at 16,000 times per second using 256 levels (i.e., 8 bits).

Figure 3.25 Pulse amplitude modulation (PAM)

The North American telephone network uses pulse code modulation (PCM). With PCM, the input voice signal is sampled 8,000 times per second. Each time the input voice signal is sampled, 8 bits are generated. Therefore, the transmission speed on the digital circuit must be 64,000 bps (8 bits per sample x 8,000 samples per second) to transmit a voice signal when it is in digital form. Thus, the North American telephone network is built using millions of 64 Kbps digital circuits that connect via codecs to the millions of miles of analog local loop circuits into the users’ residences and businesses.
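The arithmetic behind the 64 Kbps figure generalizes to any combination of sampling rate and bits per sample. The helper below is an illustrative sketch; the function name and the `channels` parameter are additions for the example, with two channels assumed for stereo CD audio.

```python
def bit_rate_bps(samples_per_second: int, bits_per_sample: int,
                 channels: int = 1) -> int:
    """Bits per second needed to carry uncompressed digitized audio."""
    return samples_per_second * bits_per_sample * channels


pcm_voice = bit_rate_bps(8000, 8)                   # telephone PCM
cd_stereo = bit_rate_bps(44100, 16, channels=2)     # CD audio, two channels
```

`pcm_voice` comes to 64,000 bps, matching the circuit speed given above, while stereo CD audio needs 1,411,200 bps.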

How Instant Messenger Transmits Voice Data

A 64 Kbps digital circuit works very well for transmitting voice data because it provides very good quality. The problem is that it requires a lot of capacity.

Adaptive differential pulse code modulation (ADPCM) is the alternative used by instant messenger (IM) and many other applications that provide voice services over lower-speed digital circuits. ADPCM works in much the same way as PCM. It samples the incoming voice signal 8,000 times per second and calculates the same 8-bit amplitude value as PCM. However, instead of transmitting the 8-bit value, it transmits the difference between the 8-bit value in the last time interval and the current 8-bit value (i.e., how the amplitude has changed from one time period to another). Because analog voice signals change slowly, these changes can be adequately represented by using only 4 bits. This means that ADPCM can be used on digital circuits that provide only 32 Kbps (4 bits per sample x 8,000 samples per second = 32,000 bps).
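The idea of sending differences rather than full values can be sketched with a simplified differential coder. This example is an assumption-laden illustration and not standardized ADPCM: it uses a fixed 4-bit difference range and omits the adaptive step size that gives ADPCM its name. The function names and sample values are hypothetical.

```python
from typing import List

DIFF_MIN, DIFF_MAX = -8, 7  # range of a signed 4-bit value


def encode(samples: List[int]) -> List[int]:
    """Encode samples as 4-bit differences from the previous reconstructed value."""
    diffs = []
    predicted = 0
    for sample in samples:
        diff = max(DIFF_MIN, min(DIFF_MAX, sample - predicted))
        diffs.append(diff)
        predicted += diff  # track what the decoder will reconstruct
    return diffs


def decode(diffs: List[int]) -> List[int]:
    """Rebuild the samples by accumulating the received differences."""
    samples = []
    value = 0
    for diff in diffs:
        value += diff
        samples.append(value)
    return samples


# A slowly changing signal: every difference fits in 4 bits.
slow_signal = [0, 3, 6, 8, 9, 9, 7, 4, 0, -5]
codes = encode(slow_signal)
restored = decode(codes)
```

The slowly changing signal is restored exactly, whereas a sudden jump larger than the 4-bit range would be clipped and the decoder would need several samples to catch up.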

Networking Your Car

MANAGEMENT FOCUS

Cars are increasingly becoming computers on wheels. About 30% of the cost of a car lies in its electronics — chips, networks, and software. Computers have been used in cars for many years for driving control (e.g., engine management systems, antilock brakes, air bag controls), but as CD players, integrated telephones (e.g., Cadillac’s OnStar), and navigation systems become more common, the demands on car networks are quickly increasing. More manufacturers are moving to digital computer controls rather than traditional analog controls for many of the car’s basic functions (e.g., BMW’s iDrive), making the car network a critical part of car design.

In many ways, a car network is similar to a local area network. There is a set of devices (e.g., throttle control, brakes, fuel injection, CD player, navigation system) connected by a network. Traditionally, each device has used its own proprietary protocol. Today, manufacturers are quickly moving to adopt standards to ensure that all components work together across one common network. One common standard is Media-Oriented Systems Transport (MOST). Any device that conforms to the MOST standard can be plugged into the network and can communicate with the other devices.

The core of the MOST standard is a set of 25 or 40 megabit per second fiber-optic cables that run throughout the car. Fiber-optic cabling was chosen over more traditional coaxial or twisted pair cabling because it provides a high capacity sufficient for most future predicted needs, is not susceptible to interference, and weighs less than coaxial or twisted pair cables. Compared to coaxial or twisted pair cables, fiber-optic cables save hundreds of feet of cabling and tens of pounds of weight in a typical car. Weight is important in car design, whether it is a high performance luxury sedan or an economical entry level car, because increased weight decreases both performance and gas mileage.

As digital devices such as Bluetooth phones and Wi-Fi wireless computer networks become standard on cars, the push to digital networks will only increase.

Several versions of ADPCM have been developed and standardized by the ITU-T. There are versions designed for 8 Kbps circuits (which send 1 bit 8,000 times per second) and 16 Kbps circuits (which send 2 bits 8,000 times per second), as well as the original 32 Kbps version. However, there is a trade-off here. Although the 32 Kbps version usually provides as good a sound quality as that of a traditional voice telephone circuit, the 8 Kbps and 16 Kbps versions provide poorer sound quality.

Voice over Internet Protocol (VoIP)

Voice over Internet Protocol (VoIP, pronounced voyp) is commonly used to transmit phone conversations over digital networks. VoIP is a relatively new standard that uses digital telephones with built-in codecs to convert analog voice data into digital data (see Figure 3.26).

Figure 3.26 VoIP phone

Because the codec is built into the telephone, the telephone transmits digital data and therefore can be connected directly into a local area network, in much the same manner as a typical computer. Because VoIP phones operate on the same networks as computers, we can reduce the amount of wiring needed; with VoIP, we need to operate and maintain only one network throughout our offices, rather than two separate networks, one for voice and one for data. However, this also means that data networks with VoIP phones must be designed to operate in emergencies (to enable 911 calls) even when the power fails; they must have uninterruptible power supplies (UPS) for all network circuits.

One commonly used VoIP standard is G.722 wideband audio, which is a version of ADPCM that operates at 64 Kbps. It samples 16,000 times per second, twice the rate of telephone PCM, and encodes the result at an average of four bits per sample.

Because VoIP phones are digital, they can also contain additional capabilities. For example, high-end VoIP phones often contain computer chips to enable them to download and install small software applications so that they can function in many ways like computers.