
7. Putting It All Together: The MP3 Encoder

Let's apply everything in this chapter to one real-world device. When you record audio with your phone and the file is encoded as MP3:

[Diagram: the MP3 encoding pipeline, from microphone to compressed bitstream]

Reading top-down:

  1. The microphone captures continuous-time audio: pressure variations in air, converted to voltage by a piezoelectric or condenser sensor.
  2. An anti-aliasing filter (analog, cutoff around 20 kHz) blocks high frequencies before sampling, so they cannot fold back into the audible band (Section 3.4).
  3. Sampling and quantization turn the analog signal into discrete-time, discrete-amplitude samples, typically at 44.1 kHz with 16 bits per sample. Now we have x[n] (Section 3).
  4. The encoder splits x[n] into short frames of 1152 samples (about 26 ms each at 44.1 kHz). Audio is locally stationary on this scale, so each frame can be treated independently.
  5. Each frame is filtered into 32 sub-bands by a polyphase filterbank, then each sub-band is processed with a Modified Discrete Cosine Transform (MDCT), a relative of the Fourier transform designed for spectral analysis of overlapping windows. Now we know how much energy is at each frequency (Section 2).
  6. A psychoacoustic model decides, for each sub-band, the minimum precision (number of bits) needed to encode that frequency without audible artifacts. Quiet sub-bands near loud ones get fewer bits because they are "masked" by their neighbors. This is where MP3 really earns its compression: it throws away information the ear cannot hear.
  7. The sub-band coefficients are quantized with the chosen precision.
  8. The bits are Huffman-coded for further compression.
  9. The compressed bitstream is the MP3 file.
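The framing arithmetic in step 4 and the precision trade-off in steps 6 and 7 can be checked with a few lines of Python. This is a minimal sketch, not an MP3 encoder: the helper names (`frame_duration_ms`, `split_into_frames`, `quantize`, `max_error`) are hypothetical, and for simplicity the quantizer is applied to time-domain samples of a test tone, whereas MP3 quantizes frequency-domain coefficients.

```python
import math

SAMPLE_RATE = 44_100   # Hz, the typical rate from step 3
FRAME_SIZE = 1152      # samples per frame, from step 4


def frame_duration_ms(frame_size=FRAME_SIZE, sample_rate=SAMPLE_RATE):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_size / sample_rate


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Cut a list of samples into whole frames; a trailing partial frame is dropped."""
    n_frames = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]


def quantize(value, bits, full_scale=1.0):
    """Uniform quantizer: snap value to one of 2**bits - 1 evenly spaced levels
    covering [-full_scale, full_scale] (an illustrative choice, not MP3's quantizer)."""
    if bits < 2:
        raise ValueError("need at least 2 bits")
    half_levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / full_scale * half_levels) * full_scale / half_levels


def max_error(values, bits):
    """Largest absolute quantization error over a sequence of values."""
    return max(abs(v - quantize(v, bits)) for v in values)


# One second of a 1 kHz test tone at half of full scale.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]
frames = split_into_frames(tone)

print(f"{frame_duration_ms():.2f} ms per frame, {len(frames)} whole frames per second")
print("max error at 16 bits:", max_error(frames[0], 16))
print("max error at  4 bits:", max_error(frames[0], 4))
```

Fewer bits means a coarser grid and a larger error. The psychoacoustic model's job is to spend few bits only where that larger error will be masked.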

Decoding is the reverse: bitstream → Huffman decode → de-quantize → inverse transform → discrete-time audio → DAC → analog audio → speaker. The reconstruction filter at the DAC is exactly the one described in Section 3.5.
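The "inverse transform" step works because the MDCT's overlapping windows are designed so that the aliasing introduced in one block is cancelled by its neighbors when the decoder overlap-adds them. The sketch below illustrates this with a direct (slow) MDCT and inverse MDCT in pure Python. It rests on several assumptions that do not come from the text above: a block size of N = 8, a sine window, and no quantization in between, so the round trip is exact up to floating-point error. The function names are hypothetical.

```python
import math


def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]**2 + w[n+N]**2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, N):
    """Direct MDCT: 2N input samples -> N coefficients."""
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs, N):
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples.
    The 2/N scale matches the windowed analysis/synthesis used below."""
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]


def round_trip(signal, N):
    """Window -> MDCT -> IMDCT -> window -> overlap-add, with hop size N.
    len(signal) must be a multiple of N."""
    if len(signal) % N != 0:
        raise ValueError("signal length must be a multiple of N")
    w = sine_window(N)
    # Pad with N zeros on each side so every real sample is covered by two blocks.
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [w[n] * padded[start + n] for n in range(2 * N)]
        y = imdct(mdct(block, N), N)
        for n in range(2 * N):
            out[start + n] += w[n] * y[n]   # synthesis window, then overlap-add
    return out[N:N + len(signal)]


signal = [math.sin(0.3 * n) for n in range(32)]
restored = round_trip(signal, 8)
worst = max(abs(a - b) for a, b in zip(signal, restored))
print("largest reconstruction error:", worst)
```

In a real decoder the coefficients pass through de-quantization first, so the output is only an approximation of the input. The transform itself adds no error of its own.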

Every concept in this chapter shows up: sampling theorem (step 3), Fourier (step 5), LTI (the filterbank), quantization (step 7). You can't build MP3 without this chapter, and you can't break a side-channel attack without it either, as the next section makes concrete.