
2. Multistage Amplifiers: Cascading for More Gain

A single transistor stage gives you a voltage gain of maybe 50 to 200. Audio preamps need thousands. Instrumentation front-ends need millions. ECG amplifiers need to extract a signal of about a millivolt from a noisy electrode interface, which means input-referred noise in the microvolts and gain on the order of a thousand. So we cascade.

2.1 Why cascade

Three concrete reasons:

  1. More total gain. Two stages of 100 each give 10,000. Three stages of 50 give 125,000.
  2. Specialized roles. The first stage can be optimized for high input impedance (so the source is not loaded), the middle stages for raw gain, and the last stage for low output impedance (so the load is not starved).
  3. Buffering. Many sources need very high input impedance: piezo sensors, electret microphones, glass-electrode pH probes, biopotential electrodes. Many loads need very low output impedance: speakers, transmission lines, motors. One transistor cannot do both at once. Cascading lets you partition the job.

Restaurant-kitchen analogy. The chef who garnishes plates is great at fine motor control but cannot wrestle the 25 kg sack of flour. The prep cook who breaks down whole carcasses cannot do delicate plating. A working kitchen has multiple stations, each specialized for one task, and food moves through the line. A multistage amplifier is the same: each stage does what it is good at, and the signal moves through the stations.

2.2 Coupling between stages

Between stages we need to transfer the AC signal but block one stage's DC bias from disturbing the next. Three options exist, each with tradeoffs.

RC (capacitor) coupling. A series capacitor between stages.

  • Cheapest and simplest. Used in audio preamps, guitar pedals, scope front-ends.
  • Frequency response: the low cutoff is $f_L = 1/(2\pi R C)$, where $R$ is the resistance the capacitor sees: the previous stage's output impedance in series with the next stage's input impedance. Below $f_L$ the capacitor increasingly blocks the signal.
  • DC drift of one stage does not affect the next, because the cap blocks DC. This is huge for stability.
  • Sensitive at low frequencies. Subwoofer-grade audio equipment needs large coupling caps (10 µF or more) to keep the low cutoff well below 20 Hz.
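
To put numbers on the low-cutoff bullet, here is a minimal sketch. It assumes the coupling capacitor sees the driving stage's output resistance in series with the next stage's input resistance, and that both are purely resistive. The function names and the example values (1 kΩ, 10 kΩ, a 2 Hz target) are illustrative, not taken from any particular design.

```python
import math

def coupling_cutoff_hz(r_out, r_in, c):
    """Low -3 dB cutoff of a series coupling capacitor.

    The capacitor sits in series with the signal path, so it sees
    r_out + r_in (assumption: both impedances are purely resistive).
    """
    return 1.0 / (2.0 * math.pi * (r_out + r_in) * c)

def min_coupling_cap(r_out, r_in, f_low):
    """Smallest capacitance that puts the cutoff at or below f_low."""
    return 1.0 / (2.0 * math.pi * (r_out + r_in) * f_low)

# Illustrative values: 1 kOhm driving stage, 10 kOhm next-stage input.
r_out, r_in = 1e3, 10e3
print(f"10 uF cap   -> f_L = {coupling_cutoff_hz(r_out, r_in, 10e-6):.2f} Hz")
print(f"2 Hz target -> C >= {min_coupling_cap(r_out, r_in, 2.0) * 1e6:.1f} uF")
```

The required capacitance scales inversely with the resistance at the interface, which is why low-impedance interfaces need physically large coupling capacitors.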

Transformer coupling. A small transformer between stages.

  • Provides DC isolation and impedance matching, in one component.
  • Used in classic vacuum-tube audio output stages, RF amplifiers needing clean 50 Ω impedance matching, and balanced audio interfaces (the famous Jensen and Lundahl audio transformers).
  • Bulky, expensive, limited bandwidth at both ends, and the magnetic core can saturate on big signals causing distortion. Falling out of favor outside niche applications.

Direct coupling. Stages share DC bias.

  • No low-frequency cutoff. DC passes through. Useful for amplifiers that have to handle very slow signals (chopper amps, DC servos, instrumentation amplifiers reading thermocouples).
  • DC drift accumulates. A small offset in stage 1 gets multiplied by stage 2's gain, then by stage 3's gain, and quickly becomes huge. After three stages of 100× gain, a 1 mV input offset is nominally 1000 V at the output, so the output simply pins against a supply rail, and a drift of 10 µV/°C becomes 10 V/°C.
  • Solved by clever bias circuits (current mirrors, level shifters, replica biasing) or by introducing feedback that controls the DC output (we get to feedback in section 3).
  • Used inside almost every modern analog IC. Op-amps are direct-coupled internally, with the DC servoing job done by the feedback you wrap around them externally.
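
The drift arithmetic in the list above can be sketched in a few lines. This is a rough illustration, not a circuit simulation: the function name `propagate_dc` and the ±15 V rail are assumptions made for the example, and saturation is modelled as simple clipping at each stage.

```python
def propagate_dc(v_in, stage_gains, v_rail=None):
    """DC level at the output of a chain of direct-coupled stages.

    v_rail, if given, clips each stage's output to +/- v_rail, a crude
    stand-in for supply-rail saturation (an assumption of this sketch).
    """
    v = v_in
    for gain in stage_gains:
        v *= gain
        if v_rail is not None:
            v = max(-v_rail, min(v_rail, v))
    return v

gains = [100, 100, 100]
print(propagate_dc(1e-3, gains))             # nominal output offset, volts
print(propagate_dc(1e-3, gains, v_rail=15))  # same offset with assumed +/-15 V rails
print(propagate_dc(10e-6, gains))            # output drift, volts per degree C
```

With the rails included, the last stage sits pinned at the supply, which is what "DC drift accumulates" looks like on the bench.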

2.3 Cascaded amplifier analysis

Total voltage gain equals the product of individual stage gains, adjusted for loading. Each stage's input impedance loads the previous stage's output. Compute the Thevenin equivalent at each interface, account for the loading, and multiply.

Concretely, if stage 1 has open-circuit gain $A_1$ and output impedance $R_{o1}$, and stage 2 has input impedance $R_{i2}$, then the gain seen at stage 2's input is

$$A_{1,\text{loaded}} = A_1 \cdot \frac{R_{i2}}{R_{o1} + R_{i2}}$$

That is just a Thevenin/voltage-divider operation (Chapter 2). The total gain is

$$A_{\text{total}} = A_{1,\text{loaded}} \cdot A_{2,\text{loaded}} \cdot A_{3,\text{loaded}} \cdots$$

If you ignore loading and just multiply open-circuit gains you will systematically over-predict the actual amplifier gain, sometimes by an order of magnitude.
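
A small sketch shows how large the over-prediction can get. It treats every interface as a resistive voltage divider, as described above. The function `cascade_gain` and the stage values (gain 100, 2 kΩ input, 5 kΩ output) are hypothetical, chosen only to resemble a common-emitter stage.

```python
import math

def cascade_gain(stages, r_source=0.0, r_load=math.inf):
    """Loaded voltage gain of a chain of stages.

    stages: list of (open_circuit_gain, r_in, r_out) tuples, one per stage.
    Each interface is treated as a resistive voltage divider.
    """
    total = 1.0
    r_drive = r_source                       # Thevenin resistance driving the next input
    for a_oc, r_in, r_out in stages:
        total *= r_in / (r_drive + r_in)     # divider at this stage's input
        total *= a_oc                        # the stage's own open-circuit gain
        r_drive = r_out
    if not math.isinf(r_load):
        total *= r_load / (r_drive + r_load) # divider at the final load
    return total

# Hypothetical stage: gain 100, 2 kOhm input, 5 kOhm output, used three times.
stages = [(100.0, 2e3, 5e3)] * 3
ideal = math.prod(a for a, _, _ in stages)
actual = cascade_gain(stages)
print(f"ideal  = {ideal:,.0f}")
print(f"loaded = {actual:,.0f}  ({ideal / actual:.1f}x lower)")
```

With these values the two inter-stage dividers alone cut the gain from the naive 1,000,000 to about 82,000, roughly an order of magnitude.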

Bandwidth shrinks with cascading. This is the surprising part. Suppose each of $n$ stages has a single high-frequency pole at the same frequency $B$. At $B$, each stage is down 3 dB, so $n$ stages together are down $3n$ dB. The cascaded bandwidth is therefore not $B$ but the lower frequency at which the combined response has fallen 3 dB. Working through the algebra:

$$B_n = B \cdot \sqrt{2^{1/n} - 1}$$

For $n = 2$: $B_2 = 0.64 B$. For $n = 3$: $B_3 = 0.51 B$. For $n = 5$: $B_5 = 0.39 B$. So three stages of 1 MHz amplifiers do not give you 1 MHz total bandwidth, only about 510 kHz. Cascading buys gain at the cost of bandwidth, and the cost compounds.
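
The algebra behind the bandwidth formula is short. As a sketch, assume each stage is an ideal single-pole low-pass with its passband gain normalized to 1; the symbol $H$ for the per-stage response is introduced here only for illustration.

```latex
|H(f)| = \frac{1}{\sqrt{1 + (f/B)^2}}
\quad\Longrightarrow\quad
|H(f)|^n = \frac{1}{\left(1 + (f/B)^2\right)^{n/2}}

% The cascaded -3 dB point B_n is where the magnitude has fallen to 1/\sqrt{2}:
\left(1 + (B_n/B)^2\right)^{n/2} = \sqrt{2}
\;\Longrightarrow\;
1 + (B_n/B)^2 = 2^{1/n}
\;\Longrightarrow\;
B_n = B \sqrt{2^{1/n} - 1}
```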

A short Python calculation makes this concrete:

python
import numpy as np
 
def cascaded_bandwidth(n, B=1.0):
    """Combined -3 dB bandwidth of n identical single-pole stages."""
    return B * np.sqrt(2**(1.0/n) - 1)
 
for n in range(1, 8):
    print(f"n = {n}: cascaded BW = {cascaded_bandwidth(n):.3f} B")

Output:

plaintext
n = 1: cascaded BW = 1.000 B
n = 2: cascaded BW = 0.644 B
n = 3: cascaded BW = 0.510 B
n = 4: cascaded BW = 0.435 B
n = 5: cascaded BW = 0.386 B
n = 6: cascaded BW = 0.350 B
n = 7: cascaded BW = 0.323 B

The numbers tell the story. For modern instrumentation amplifiers, designers fight this by using fewer, higher-gain stages, and by spreading the rolloffs (so each stage rolls off at a different frequency rather than all at the same one).
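
The effect of spreading the rolloffs can be checked numerically. The sketch below finds the −3 dB point of an arbitrary set of single-pole stages by bisection. The function names and the 1, 2, 4 MHz pole placement are assumptions for illustration: the slowest stage stays where it was and the others are pushed higher.

```python
import math

def magnitude(f, poles):
    """|H(f)| for a cascade of single-pole low-pass stages with unity DC gain."""
    m = 1.0
    for p in poles:
        m /= math.sqrt(1.0 + (f / p) ** 2)
    return m

def bandwidth_3db(poles, tol=1e-9):
    """Bisection for the frequency where the cascade falls to 1/sqrt(2)."""
    target = 1.0 / math.sqrt(2.0)
    lo, hi = 0.0, min(poles)   # at the lowest pole the cascade is already down >= 3 dB
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if magnitude(mid, poles) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

identical = bandwidth_3db([1.0, 1.0, 1.0])   # three poles at B = 1 (say, 1 MHz)
staggered = bandwidth_3db([1.0, 2.0, 4.0])   # same slowest pole, others pushed out
print(f"identical poles: {identical:.3f} B")
print(f"staggered poles: {staggered:.3f} B")
```

With one dominant pole and the rest moved to higher frequencies, the cascade recovers much of the lost bandwidth (roughly 0.8 B here against 0.51 B), at the price of needing faster stages elsewhere in the chain.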

2.4 Darlington pair

Cascade two BJTs so the emitter of the first feeds the base of the second:

plaintext
                  C  (composite collector, both Cs tied together)
                  │
         ┌────────┴────────┐
         │                 │
B ─────[Q1]                │
         │                 │
         └──────────────[Q2]
                           │
                           E  (composite emitter)

The "emitter" of the composite device is the emitter of Q2; the "collector" is both transistors' collectors tied together; the "base" is the base of Q1. The composite current gain is approximately $\beta_1 \cdot \beta_2$, easily 10,000 to 50,000.

Drawbacks:

  • The composite base-emitter drop is two $V_{BE}$s (about 1.4 V), and the composite cannot saturate below roughly one $V_{BE}$ plus Q1's $V_{CE,sat}$ (around 0.9 V). Both cut into your output swing in low-voltage systems and dissipate more power in the device.
  • Turn-off is slow because once Q1 switches off, nothing actively pulls the stored charge out of Q2's base. A resistor from Q2's base to its emitter, built into many packaged Darlingtons, gives that charge a path.
  • Two cascaded devices add phase shift at high frequencies, which can encourage parasitic oscillation in poorly-laid-out circuits.

Used in audio output stages (the TIP120 is the canonical NPN Darlington in a TO-220 package), motor drivers, high-current LED drivers, solenoid drivers, and Christmas-light controllers. Any time you need a power transistor with a base that is easy to drive from a 5 mA microcontroller pin, you reach for a Darlington.
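
A quick calculation shows why the base is so easy to drive. The sketch below uses the exact composite gain $\beta_1\beta_2 + \beta_1 + \beta_2$, which follows from summing both collector currents and reduces to $\beta_1\beta_2$ when both gains are large. The function names, the gains of 100 and 50, the 2 A load, and the 5× overdrive factor are all illustrative assumptions, and any internal base-emitter resistors are ignored.

```python
def darlington_beta(beta1, beta2):
    """Composite current gain I_C / I_B of a Darlington pair.

    I_B2 = (beta1 + 1) * I_B1, and the composite collector current is
    I_C1 + I_C2, giving beta1*beta2 + beta1 + beta2.
    Ignores base-emitter resistors (assumption of this sketch).
    """
    return beta1 * beta2 + beta1 + beta2

def base_drive_ma(i_load_a, beta1, beta2, overdrive=1.0):
    """Base current in mA needed to support i_load_a amps of collector current."""
    return 1e3 * overdrive * i_load_a / darlington_beta(beta1, beta2)

beta = darlington_beta(100, 50)
drive = base_drive_ma(2.0, 100, 50, overdrive=5.0)
print(f"composite beta = {beta}")
print(f"base drive for a 2 A load with 5x overdrive = {drive:.2f} mA")
```

Even with a generous overdrive margin for switching use, the required base current stays within what a single microcontroller pin can source.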

2.5 Cascode amplifier (revisited)

We saw the cascode in section 1.3 as a Miller-killer. As an amplifier configuration, it gives you:

  • High gain (because Q2's CB stage has very high output impedance, and the gain into Q2's collector node can be enormous).
  • High bandwidth (because Q1's collector barely swings, so Q1's $C_\mu$ is not Miller-multiplied).
  • High output impedance (good for driving a current-source load and for designing op-amp gain stages).
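
The bandwidth point can be made concrete with the standard Miller approximation, in which a capacitance bridging an inverting gain $A$ appears at the input multiplied by $(1 + |A|)$. The numbers below are hypothetical: a 2 pF $C_\mu$, a plain common-emitter gain of −100, and a gain of about −1 from Q1's base to its own collector in the cascode, on the assumption that Q1's load is Q2's emitter at the same bias current.

```python
def miller_input_cap(c_mu, gain_base_to_collector):
    """Input capacitance contributed by C_mu under the Miller approximation."""
    return c_mu * (1.0 + abs(gain_base_to_collector))

c_mu = 2e-12  # hypothetical 2 pF base-collector capacitance

plain_ce = miller_input_cap(c_mu, -100.0)  # common-emitter stage, gain -100
cascode = miller_input_cap(c_mu, -1.0)     # Q1 inside a cascode, gain about -1

print(f"common emitter: {plain_ce * 1e12:.0f} pF at the input")
print(f"cascode Q1:     {cascode * 1e12:.0f} pF at the input")
```

The overall stage gain is similar in both cases; the cascode simply moves the large voltage swing to a node that $C_\mu$ of Q1 does not bridge.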

The penalty is voltage headroom. The cascode stack uses up two $V_{CE,sat}$ plus one bias $V_{BE}$ worth of headroom from the supply rail. On a 1.2 V chip built in a scaled-down silicon process, this is headroom you may not have. That is why modern low-voltage CMOS op-amps move to "folded cascode" topologies, which fold the signal current into a cascode device of the opposite polarity so the input stage and the cascode stage share the supply rather than stacking.

Real-world example. Many precision, low-noise op-amps favored in high-end audio (the Burr-Brown OPA627, the Analog Devices AD797, the Linear Technology LT1115) follow the input differential pair with cascode or folded-cascode circuitry. That topological choice helps them reach open-loop gains of 120 dB or so and gain-bandwidth products in the tens of MHz from relatively few transistors.

2.6 The differential amplifier: the IC's favorite

Two transistors with their emitters joined and tied to a common current source (the tail current $I_{EE}$):

plaintext
               V_CC
                │
         ┌──────┴──────┐
         │             │
        R_C           R_C
         │             │
 v_o1 ───┤             ├─── v_o2
         │             │
v_in1 ─[Q1]           [Q2]─ v_in2
         │             │
         └──────┬──────┘   E (joined emitters)
                │
              I_EE  (tail current source)
                │
               V_EE

Inputs go to the bases of Q1 and Q2. The output is taken either differentially (across the two collectors) or single-ended (one collector, with the other ignored).

The magic: common-mode signals cancel, differential signals are amplified.

If both inputs rise by the same amount $\Delta V$, both transistors try to draw more current. But the tail current is fixed by the current source, so they cannot. The result: both collector voltages drop by the same small amount (almost nothing, because the current source has very high output impedance), and the differential output is zero. Common-mode rejection.

If one input rises and the other falls (a differential signal), the imbalance steers the tail current, with Q1 drawing more and Q2 drawing less. The collector voltages move opposite ways, producing a strong differential output.
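
The current steering can be sketched with the standard exponential BJT model. The assumptions are loud ones: perfectly matched transistors, negligible base current, an ideal tail source, and a thermal voltage of about 25.85 mV near room temperature. The function names and the 1 mA, 10 kΩ values are illustrative.

```python
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts (assumed)

def collector_currents(v_d, i_ee):
    """Split of the tail current for a differential input v_d = v_in1 - v_in2.

    Assumes matched BJTs, negligible base current, and an ideal tail source.
    """
    i_c1 = i_ee / (1.0 + math.exp(-v_d / V_T))
    i_c2 = i_ee - i_c1          # the tail source fixes the total
    return i_c1, i_c2

def diff_output(v_d, i_ee, r_c):
    """Differential output v_o1 - v_o2 for collector resistors r_c."""
    i_c1, i_c2 = collector_currents(v_d, i_ee)
    return -(i_c1 - i_c2) * r_c

i_ee, r_c = 1e-3, 10e3
for v_d_mv in (0, 1, 10, 26, 100):
    i_c1, i_c2 = collector_currents(v_d_mv * 1e-3, i_ee)
    v_od = diff_output(v_d_mv * 1e-3, i_ee, r_c)
    print(f"v_d = {v_d_mv:3d} mV: I_C1 = {i_c1 * 1e3:.3f} mA, "
          f"I_C2 = {i_c2 * 1e3:.3f} mA, v_od = {v_od:.3f} V")
```

For small inputs the response is linear with gain $-I_{EE} R_C / (2 V_T)$; beyond a few multiples of $V_T$ one transistor takes nearly all of the tail current and the output limits.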

The differential pair is the input stage of virtually every op-amp. Its CMRR (common-mode rejection ratio), the ratio of differential gain to common-mode gain, is one of the headline op-amp specs, typically 80 to 120 dB. A CMRR of 100 dB means a common-mode signal has $10^5$ times less effect than a differential signal of the same size: five orders of magnitude of separation. A 5 V common-mode voltage then acts like a differential input error of only 50 µV, and at 120 dB only 5 µV, which is what lets a precision op-amp resolve a 10 µV thermocouple signal sitting on a 5 V common-mode level.
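
CMRR figures are easier to judge once converted into an equivalent input error. A minimal sketch, using the definition of CMRR as differential gain over common-mode gain; the function name is made up for the example.

```python
def cm_error_referred_to_input(v_cm, cmrr_db):
    """Differential input voltage that would produce the same output
    as a common-mode voltage v_cm, for a given CMRR in dB."""
    return v_cm / 10 ** (cmrr_db / 20.0)

for cmrr_db in (80, 100, 120):
    err = cm_error_referred_to_input(5.0, cmrr_db)
    print(f"CMRR {cmrr_db:3d} dB: 5 V common-mode looks like "
          f"{err * 1e6:.0f} uV of differential input")
```

Every 20 dB of CMRR shrinks the error by a factor of ten, so the top of the 80 to 120 dB range matters a great deal when the wanted signal is only microvolts.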

Tug-of-war analogy. Two equal teams pull on opposite ends of a rope that runs over a pulley. If both teams pull harder by the same amount (common-mode), the rope does not move; the forces cancel at the pulley. If one team pulls harder than the other (differential), the rope moves. The tail current source plays the pulley: it fixes the total current, so only the difference between the two sides shows up at the output.

Hardware-security tie-in. Differential signaling is a defense as well as a circuit topology. LVDS, RS-485, and PCIe all signal differentially mainly to reject common-mode noise pickup, and the same symmetry makes the data harder for EM eavesdroppers to recover: a tightly coupled differential pair radiates much less than a single-ended trace because the equal and opposite currents largely cancel in the far field. Emission-sensitive designs, such as equipment built to TEMPEST requirements, have good reason to favor differential interconnects. The cost of going single-ended to save board area is leakage.