5. The CMOS Inverter, Closely Examined // VLSI Design // bhaswanth

The CMOS inverter is the hydrogen atom of digital design. Master it and the rest follows.

5.1 DC transfer characteristic

Sweep $V_{in}$ from 0 to $V_{DD}$ and measure $V_{out}$ :


   V_out
    ^
 V_DD ─┐
       │\
       │ \
       │  \
       │   \
       │    \
       │     \
       │      \________
  GND ─┘──────────── V_in
       0    V_M    V_DD

The S-shaped curve has five regions:

$V_{in} < V_{tn}$ : NMOS off, PMOS in linear. $V_{out} \approx V_{DD}$ .
$V_{tn} < V_{in} < V_M$ : NMOS in saturation, PMOS in linear. $V_{out}$ dropping but still high.
$V_{in} \approx V_M$ : both saturated. Output transitions sharply.
$V_M < V_{in} < V_{DD} - |V_{tp}|$ : NMOS linear, PMOS saturation. $V_{out}$ low.
$V_{in} > V_{DD} - |V_{tp}|$ : PMOS off, NMOS linear. $V_{out} \approx 0$ .
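The five regions above follow from the usual long-channel conditions on each transistor: a device is off below threshold, saturated when its drain-source voltage exceeds its overdrive, and linear otherwise. A minimal sketch of that bookkeeping, where the function name, the 0.3 V thresholds, and the sample points are illustrative assumptions rather than values from this section:

```python
def operating_regions(v_in, v_out, v_dd=1.0, v_tn=0.3, v_tp_abs=0.3):
    """Return (nmos_region, pmos_region) for one point on the transfer curve.

    Thresholds and supply are assumed example values, not process data.
    """
    # NMOS: V_GS = v_in, V_DS = v_out
    if v_in <= v_tn:
        nmos = "off"
    elif v_out >= v_in - v_tn:
        nmos = "saturation"
    else:
        nmos = "linear"

    # PMOS: |V_GS| = v_dd - v_in, |V_DS| = v_dd - v_out
    if v_in >= v_dd - v_tp_abs:
        pmos = "off"
    elif v_out <= v_in + v_tp_abs:
        pmos = "saturation"
    else:
        pmos = "linear"
    return nmos, pmos


# Hand-picked (v_in, v_out) points, one per region; not simulated values.
for v_in, v_out in [(0.1, 1.0), (0.4, 0.95), (0.5, 0.5), (0.6, 0.05), (0.9, 0.0)]:
    print(v_in, v_out, operating_regions(v_in, v_out))
```

Classifying on both $V_{in}$ and $V_{out}$ avoids hard-coding the region boundaries, which only hold approximately for the middle three regions.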

5.2 Switching threshold $V_M$

The switching threshold $V_M$ is where $V_{in} = V_{out}$ . Setting NMOS and PMOS currents equal in saturation:

$V_M = \frac{V_{tn} + r(V_{DD} - |V_{tp}|)}{1 + r}, \quad r = \sqrt{\frac{\mu_p C_{ox} (W/L)_p}{\mu_n C_{ox} (W/L)_n}}$
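The intermediate steps, assuming the long-channel square-law model and writing $k_n = \mu_n C_{ox} (W/L)_n$ and $k_p = \mu_p C_{ox} (W/L)_p$ as shorthand (these two symbols are introduced here for convenience), are roughly:

$\frac{1}{2} k_n (V_M - V_{tn})^2 = \frac{1}{2} k_p (V_{DD} - V_M - |V_{tp}|)^2$

$V_M - V_{tn} = r\,(V_{DD} - V_M - |V_{tp}|), \quad r = \sqrt{k_p / k_n}$

$V_M (1 + r) = V_{tn} + r\,(V_{DD} - |V_{tp}|)$

Taking the positive square root is valid because both overdrive voltages are positive when both devices conduct.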

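Set $V_M = V_{DD}/2$ with matched thresholds ( $V_{tn} = |V_{tp}|$ ) and the formula requires $r = 1$ , that is $(W/L)_p / (W/L)_n = \mu_n / \mu_p$ . Because hole mobility is lower than electron mobility, the PMOS has to be ~2 to 2.5x wider than the NMOS to compensate. This is the classic "2:1 P:N ratio" rule of thumb.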
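To see how the width ratio moves the switching threshold, here is a small sketch that evaluates the $V_M$ formula directly. The mobility ratio of 2.5, the 0.3 V thresholds, and the function name are assumptions chosen for illustration.

```python
import math


def switching_threshold(v_dd, v_tn, v_tp_abs, beta_ratio):
    """V_M from the saturation-current balance; beta_ratio = k_p / k_n."""
    r = math.sqrt(beta_ratio)
    return (v_tn + r * (v_dd - v_tp_abs)) / (1 + r)


MU_RATIO = 2.5                       # assumed mu_n / mu_p
V_DD, V_TN, V_TP_ABS = 1.0, 0.3, 0.3  # assumed supply and thresholds

for width_ratio in (1.0, 2.0, 2.5):
    # With equal L and C_ox, k_p / k_n = (W_p / W_n) / (mu_n / mu_p).
    beta_ratio = width_ratio / MU_RATIO
    v_m = switching_threshold(V_DD, V_TN, V_TP_ABS, beta_ratio)
    print(f"W_p/W_n = {width_ratio:.1f} -> V_M = {v_m:.3f} V")
```

With these assumed numbers, a width ratio equal to the mobility ratio puts $V_M$ at mid-supply, and a narrower PMOS pulls it below $V_{DD}/2$.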

5.3 Noise margins

The noise margins measure how much noise an input can absorb without flipping the output: the high noise margin $NM_H$ applies to an input at logic 1, the low noise margin $NM_L$ to an input at logic 0:

$NM_H = V_{OH} - V_{IH}, \qquad NM_L = V_{IL} - V_{OL}$

where $V_{IL}$ and $V_{IH}$ are the input voltages at which the slope of the transfer curve (the small-signal gain) equals -1, and $V_{OH}$ and $V_{OL}$ are the nominal high and low output levels. For a balanced CMOS inverter at $V_{DD} = 1.0$ V, both noise margins are typically around 0.4 V. Compare this to nMOS logic, where $V_{OL}$ does not even reach 0; CMOS has clean 0/1 levels, and that translates to far better noise immunity.
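One way to make the gain = -1 definition concrete is to trace a transfer curve numerically and read the margins off it. The sketch below assumes a long-channel square-law model with a small channel-length modulation term, matched devices, and 0.3 V thresholds. All parameter values and function names are illustrative, and a real process would need a proper device model.

```python
def drain_current(v_gs, v_ds, k, v_t, lam=0.05):
    """Square-law drain current magnitude; lam is an assumed channel-length modulation."""
    v_ov = v_gs - v_t
    if v_ov <= 0 or v_ds <= 0:
        return 0.0
    if v_ds < v_ov:                       # linear region
        core = k * (v_ov * v_ds - 0.5 * v_ds ** 2)
    else:                                 # saturation
        core = 0.5 * k * v_ov ** 2
    return core * (1 + lam * v_ds)


def v_out(v_in, v_dd=1.0, v_tn=0.3, v_tp_abs=0.3, k_n=1.0, k_p=1.0):
    """Solve I_n = I_p for the output voltage by bisection."""
    lo, hi = 0.0, v_dd
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        i_n = drain_current(v_in, mid, k_n, v_tn)
        i_p = drain_current(v_dd - v_in, v_dd - mid, k_p, v_tp_abs)
        if i_n < i_p:
            lo = mid                      # pull-up stronger: output sits higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


def noise_margins(v_dd=1.0, steps=1000):
    """Return (NM_H, NM_L) from a sampled transfer curve."""
    dv = v_dd / steps
    vin = [i * dv for i in range(steps + 1)]
    vout = [v_out(v, v_dd) for v in vin]
    # Input voltages where the central-difference gain is at or below -1.
    steep = [vin[i] for i in range(1, steps)
             if (vout[i + 1] - vout[i - 1]) / (2 * dv) <= -1.0]
    v_il, v_ih = steep[0], steep[-1]
    v_oh, v_ol = vout[0], vout[-1]        # nominal levels taken at the curve ends
    return v_oh - v_ih, v_il - v_ol


nm_h, nm_l = noise_margins()
print(f"NM_H = {nm_h:.3f} V, NM_L = {nm_l:.3f} V")
```

Because the two devices are assumed identical in strength and threshold, the two margins come out equal to within the grid resolution. Skewing `k_p` relative to `k_n` trades one margin against the other.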

5.4 Propagation delay

The propagation delay is the time from a 50 percent input transition to a 50 percent output transition. For an inverter driving a load $C_L$ :

$t_{pHL} \approx \frac{C_L V_{DD}/2}{I_{DS,n}}, \qquad t_{pLH} \approx \frac{C_L V_{DD}/2}{I_{DS,p}}$

Average propagation delay $t_p = (t_{pHL} + t_{pLH})/2$ . The " $I_{DS}$ " here is an effective drive current that designers approximate by averaging the actual current over the transition, as the transistor moves from saturation into the linear region. The scaling intuition is:

Larger $C_L$ → slower.
Larger $V_{DD}$ → faster, because drive current grows faster than the $C_L V_{DD}/2$ charge that has to be moved (but quadratic dynamic power penalty).
Larger $W/L$ (stronger drive) → faster, but you also load the previous stage more.
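The delay expressions above are simple enough to evaluate by hand, but a short sketch makes the scaling easy to play with. The load of 10 fF and the effective currents of 100 µA and 50 µA are made-up example numbers, and the function name is hypothetical.

```python
def propagation_delays(c_load, v_dd, i_n_eff, i_p_eff):
    """Return (t_pHL, t_pLH, t_p) in seconds from the charge/current estimate."""
    t_phl = (c_load * v_dd / 2) / i_n_eff   # NMOS discharges the load
    t_plh = (c_load * v_dd / 2) / i_p_eff   # PMOS charges the load
    return t_phl, t_plh, (t_phl + t_plh) / 2


# Assumed example: 10 fF load, 1.0 V supply, 100 uA / 50 uA effective drive.
t_phl, t_plh, t_p = propagation_delays(10e-15, 1.0, 100e-6, 50e-6)
print(f"t_pHL = {t_phl * 1e12:.1f} ps, t_pLH = {t_plh * 1e12:.1f} ps, "
      f"t_p = {t_p * 1e12:.1f} ps")

# Doubling the load doubles every delay in this first-order model.
_, _, t_p_2x = propagation_delays(20e-15, 1.0, 100e-6, 50e-6)
print(f"t_p with 2x load = {t_p_2x * 1e12:.1f} ps")
```

Note that this model treats the effective currents as fixed inputs; in practice they depend on $V_{DD}$ and $W/L$, which is where the second and third bullets come from.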

5.5 Dynamic power, derived

When a CMOS gate's output rises from 0 to $V_{DD}$ , energy from the supply moves through the PMOS into the load capacitance $C_L$ . The total energy drawn from the supply is $E_{rise} = C_L V_{DD}^2$ , of which half ( $\frac{1}{2}C_L V_{DD}^2$ ) ends up stored on the capacitor and half is dissipated in the PMOS resistance.
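The half-and-half split can be checked by integrating over the rising transition, assuming the load is a fixed linear capacitor so that the charging current is $i(t) = C_L \, dV_{out}/dt$ :

$E_{rise} = \int_0^{\infty} V_{DD} \, i(t) \, dt = C_L V_{DD} \int_0^{V_{DD}} dV_{out} = C_L V_{DD}^2$

$E_{cap} = \int_0^{\infty} V_{out} \, i(t) \, dt = C_L \int_0^{V_{DD}} V_{out} \, dV_{out} = \frac{1}{2} C_L V_{DD}^2$

The difference, $\frac{1}{2} C_L V_{DD}^2$ , is the energy lost in the PMOS. Neither integral depends on the transistor's resistance, so a stronger PMOS charges the load faster but dissipates the same energy.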

When the output falls back to 0, the stored energy $\frac{1}{2}C_L V_{DD}^2$ on the capacitor dissipates through the NMOS. The supply does not sink the energy back; it just sat there during the fall.

So per full cycle (rise + fall) the energy dissipated is $C_L V_{DD}^2$ . If the gate is clocked at frequency $f$ , the dynamic power is

$P_{dyn} = \alpha C_L V_{DD}^2 f$

where $\alpha$ is the switching activity factor, the probability that the output makes a power-consuming $0 \to 1$ transition in a given clock cycle (typical values are 0.1 to 0.3). This is the most important formula in digital power. The quadratic dependence on $V_{DD}$ is why lowering supply voltage is the single biggest lever for low-power design. The linear dependence on $f$ is why we clock-gate unused parts of a chip.
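A quick numeric check of the two scaling claims, using assumed per-gate values ( $\alpha = 0.2$ , $C_L = 1$ fF, 1 GHz clock) that are only meant to illustrate the ratios:

```python
def dynamic_power(alpha, c_load, v_dd, freq):
    """P_dyn = alpha * C_L * V_DD^2 * f, in watts."""
    return alpha * c_load * v_dd ** 2 * freq


# Assumed example gate: alpha = 0.2, C_L = 1 fF, V_DD = 1.0 V, f = 1 GHz.
base = dynamic_power(0.2, 1e-15, 1.0, 1e9)
half_vdd = dynamic_power(0.2, 1e-15, 0.5, 1e9)
half_freq = dynamic_power(0.2, 1e-15, 1.0, 0.5e9)

print(f"baseline       : {base * 1e9:.1f} nW")
print(f"half V_DD      : {half_vdd / base:.2f} x baseline")
print(f"half frequency : {half_freq / base:.2f} x baseline")
```

Halving the supply cuts dynamic power to a quarter, while halving the clock only cuts it to a half, which is the asymmetry the paragraph above describes.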

5.6 Short-circuit power

During the brief transition window when both NMOS and PMOS are partially on, a short-circuit current spikes from $V_{DD}$ to GND. This is typically 5 to 10 percent of dynamic power for normal slew rates; it grows if input transitions are slow. Buffer chains mitigate this by keeping slews fast.

5.7 Leakage power

$P_{leak} = V_{DD} I_{leak}$ , with $I_{leak}$ dominated by subthreshold conduction at modern nodes. Leakage flows whether or not the gate is switching, so on a 7 nm chip it can be tens of percent of total power, and it is still drawn when the logic is "idle."

5.8 Putting it together

$P_{total} = \alpha C_L V_{DD}^2 f \;+\; t_{sc}f V_{DD} I_{peak} \;+\; V_{DD} I_{leak}$

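Here $t_{sc}$ is the time per switching cycle during which both transistors conduct, and $I_{peak}$ is the peak short-circuit current. The total power on a real chip is the sum across millions of gates of these three terms. Most of the action in low-power design is making each term as small as possible without breaking timing.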
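As a rough sketch of how the three terms compare for a single gate, the snippet below evaluates the sum with made-up numbers. Every parameter value here (including $t_{sc} = 10$ ps, $I_{peak} = 2$ µA, and $I_{leak} = 10$ nA) is an assumption picked for illustration, not data for any real process.

```python
def power_breakdown(alpha, c_load, v_dd, freq, t_sc, i_peak, i_leak):
    """Return the three per-gate power terms and their total, in watts."""
    terms = {
        "dynamic": alpha * c_load * v_dd ** 2 * freq,
        "short_circuit": t_sc * freq * v_dd * i_peak,
        "leakage": v_dd * i_leak,
    }
    terms["total"] = sum(terms.values())
    return terms


# Assumed example values for one gate; none of these come from a datasheet.
p = power_breakdown(alpha=0.2, c_load=1e-15, v_dd=1.0, freq=1e9,
                    t_sc=10e-12, i_peak=2e-6, i_leak=10e-9)

for name in ("dynamic", "short_circuit", "leakage"):
    share = 100 * p[name] / p["total"]
    print(f"{name:14s}: {p[name] * 1e9:7.1f} nW  ({share:4.1f} %)")
print(f"{'total':14s}: {p['total'] * 1e9:7.1f} nW")
```

Scaling the per-gate total by gate count gives a chip-level estimate only if every gate shares the same parameters, which real designs do not. In practice each term is summed over gates with their own activity, load, and leakage.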