9. Sheet Resistance, Capacitance, Wire Delay // VLSI Design // bhaswanth

In older process nodes, wire delay was negligible compared to gate delay. Today wires often dominate, and layout strategy is largely about wire management.

9.1 Sheet resistance

For any thin film, the resistance of a rectangular slab is

$R = \rho \frac{L}{Wt} = R_s \frac{L}{W}, \qquad R_s = \rho/t$

where $\rho$ is resistivity, $t$ is film thickness, and $R_s$ is the sheet resistance in ohms per square. "Per square" refers to the dimensionless ratio $L/W$, the number of squares laid end to end along the wire: a square of any side length has resistance $R_s$.

Typical $R_s$ values:

Polysilicon: 20 to 40 Ω/square (heavily doped).
Salicided poly: 5 to 10 Ω/square (a thin metal silicide reduces resistance).
Diffusion (salicided): similar to salicided poly.
Metal 1 to 4 (thin): 0.1 to 0.3 Ω/square.
Metal 9 to 12 (thick): 0.02 to 0.05 Ω/square.

You see immediately why long signal wires belong on upper metals and why poly is bad for anything but short connections.
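To make the comparison concrete, here is a minimal sketch that counts squares and applies $R = R_s L/W$. The sheet resistances are midpoints of the ranges listed above. The 1 mm length and 0.5 µm width are illustrative assumptions, not values from any particular process, and using the same width on every layer is a simplification for comparison only.

```python
def wire_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """R = R_s * (L / W): sheet resistance times the number of squares."""
    squares = length_um / width_um
    return sheet_res_ohm_per_sq * squares

# Midpoints of the typical ranges quoted in the text (ohms per square).
TYPICAL_RS = {
    "polysilicon": 30.0,
    "salicided poly": 7.5,
    "thin metal (M1-M4)": 0.2,
    "thick metal (M9-M12)": 0.035,
}

LENGTH_UM = 1000.0  # assumed: a 1 mm route
WIDTH_UM = 0.5      # assumed: same drawn width on every layer

resistances = {
    layer: wire_resistance(rs, LENGTH_UM, WIDTH_UM)
    for layer, rs in TYPICAL_RS.items()
}

for layer, r in resistances.items():
    print(f"{layer:22s} {r:10.1f} ohm")
```

Under these assumptions the wire is 2000 squares long, so the poly route comes out in the tens of kilohms while the thick-metal route is tens of ohms.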

9.2 Capacitance per unit area

Each metal layer has capacitance to:

The substrate or lower metals (vertical, "to ground" cap).
Adjacent wires on the same layer (horizontal, "coupling" cap).

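Coupling capacitance dominates in modern designs, because wires are now taller than they are wide and sit close to their neighbors, and it is the source of crosstalk noise. The fix is to keep parallel runs short, space the wires farther apart, or insert shield wires (tied to $V_{DD}$ or GND) between them.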
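A rough way to see why coupling matters is the capacitive-divider estimate for a victim wire that is momentarily floating: when a neighbor swings by $V_{DD}$, the victim moves by $V_{DD} \cdot C_c / (C_c + C_g)$. The sketch below uses that idealized model. The supply voltage and both capacitance values are assumptions chosen for illustration, and a driven victim would see a smaller, decaying glitch.

```python
def crosstalk_glitch(v_swing, c_couple, c_ground):
    """Peak noise on a floating victim: capacitive divider between
    the switching aggressor (via c_couple) and ground (via c_ground)."""
    return v_swing * c_couple / (c_couple + c_ground)

V_DD = 0.8          # volts (assumed)
C_GROUND = 20e-15   # farads to substrate / other layers (assumed)
C_COUPLE = 30e-15   # farads to the switching neighbor (assumed)

unshielded = crosstalk_glitch(V_DD, C_COUPLE, C_GROUND)

# Idealized shield: a grounded wire sits between victim and aggressor,
# so the former coupling cap now terminates on ground instead.
shielded = crosstalk_glitch(V_DD, 0.0, C_GROUND + C_COUPLE)

print(f"unshielded glitch: {unshielded:.2f} V")
print(f"shielded glitch:   {shielded:.2f} V")
```

The shield does not remove the capacitance, it only moves it to a quiet node, so the wire is no faster; it is just no longer disturbed by its neighbor in this model.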

9.3 RC delay of a wire

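Approximate the wire as an RC line with total resistance $R_{wire} = R_s L/W$ and total capacitance $C_{wire} = c \cdot L W$, where $c$ is the capacitance per unit area. Lumping both into a single RC stage gives a pessimistic estimate of the 50% delay:

$t_{wire} \approx 0.69 \cdot R_{wire} \cdot C_{wire} = 0.69 \cdot R_s (L/W) \cdot c \cdot L W = 0.69 R_s c L^2$

A truly distributed line has a smaller prefactor (about 0.38 instead of 0.69), but the same dependence on $L$.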

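The squared dependence on length is the fundamental problem: doubling the length of a wire quadruples its delay. This is why long buses are broken up by repeaters. Splitting a wire into $k$ equal segments cuts each segment's delay to $1/k^2$ of the original, so the $k$ segments together contribute only $1/k$ of the original wire delay, and total delay grows roughly linearly with length instead of quadratically. Repeaters are placed at intervals ranging from a few hundred microns to a few millimeters, depending on the node and metal layer.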
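The quadratic scaling and the effect of repeaters can be checked numerically with the $0.69 R_s c L^2$ expression above. This is a minimal sketch: the sheet resistance and the capacitance per unit area are assumed illustrative values, and the repeaters are treated as ideal (their own gate delay is ignored), so the segmented result is a lower bound on real delay.

```python
def wire_delay(rs, c_area, length_um):
    """0.69 * R_s * c * L^2, with c in F/um^2 and L in um; width cancels."""
    return 0.69 * rs * c_area * length_um ** 2

def segmented_wire_delay(rs, c_area, length_um, n_segments):
    """Wire-only delay when ideal repeaters split the wire into
    n_segments equal pieces (repeater gate delay is NOT included)."""
    return n_segments * wire_delay(rs, c_area, length_um / n_segments)

RS = 0.2          # ohm/square, thin metal (from the typical range)
C_AREA = 1e-15    # F/um^2 (assumed, for illustration only)
LENGTH_UM = 1000.0

one_mm = wire_delay(RS, C_AREA, LENGTH_UM)
two_mm = wire_delay(RS, C_AREA, 2 * LENGTH_UM)
two_mm_split = segmented_wire_delay(RS, C_AREA, 2 * LENGTH_UM, 4)

print(f"1 mm:                 {one_mm * 1e12:.0f} ps")
print(f"2 mm:                 {two_mm * 1e12:.0f} ps")
print(f"2 mm in 4 segments:   {two_mm_split * 1e12:.0f} ps")
```

With these numbers the 2 mm wire is four times slower than the 1 mm wire, and splitting it into four segments brings the wire-only delay back down by a factor of four.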

9.4 Buffer chains and the $e \approx 2.7$ optimum

If you must drive a load $C_L$ much larger than your input capacitance $C_{in}$, a single stage is too slow: making that one stage strong enough means growing its input capacitance, which slows the stage before it. Instead, insert a chain of inverters, each $f$ times larger than the one before it. The total delay is

$t_{total} = N \cdot f \cdot \tau_0$

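where $N$ is the number of stages, $\tau_0$ is the delay of an inverter driving one identical inverter (so a stage with fanout $f$ takes $f \tau_0$), and $f$ is the per-stage scaling factor, with $f^N = C_L/C_{in}$. Substituting $N = \ln(C_L/C_{in})/\ln f$: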

$t_{total} = \tau_0 \ln(C_L/C_{in}) \cdot \frac{f}{\ln f}$

To minimize over $f$, differentiate $f/\ln f$: $\frac{d}{df}\left(\frac{f}{\ln f}\right) = \frac{\ln f - 1}{(\ln f)^2}$. Setting this to zero gives $\ln f = 1$, so $f = e \approx 2.718$.
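The optimum can be sanity-checked numerically by sweeping $f$ and evaluating $t_{total} = \tau_0 \ln(C_L/C_{in}) \cdot f/\ln f$. This is a minimal sketch: the load ratio of 1000 and the sweep range are arbitrary assumptions, and $N$ is treated as a continuous quantity exactly as in the derivation.

```python
import math

def chain_delay(f, load_ratio, tau0=1.0):
    """t_total = tau0 * ln(C_L/C_in) * f / ln(f), with N continuous."""
    return tau0 * math.log(load_ratio) * f / math.log(f)

LOAD_RATIO = 1000.0  # assumed C_L / C_in

# Sweep f from 1.5 to 8.0 in steps of 0.001 and keep the minimizer.
candidates = [1.5 + 0.001 * i for i in range(6501)]
best_f = min(candidates, key=lambda f: chain_delay(f, LOAD_RATIO))

print(f"numerical optimum f = {best_f:.3f}, e = {math.e:.3f}")
```

The minimizer does not depend on the load ratio, since $\ln(C_L/C_{in})$ only scales the curve.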

In practice, designers use $f = 4$. (The "FO4 delay", the propagation delay of an inverter driving four identical inverters, is the standard speed metric of a process.) Since $f/\ln f$ is about 2.89 at $f = 4$ versus 2.72 at $f = e$, using $f = 4$ costs only about 6% more delay but uses fewer stages, which saves area and power.
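The trade-off between $f = e$ and $f = 4$ is easier to see with a whole number of stages. The sketch below picks the nearest integer $N$ for a target fanout, recomputes the actual per-stage ratio so that $f^N = C_L/C_{in}$ still holds, and reports delay in units of $\tau_0$. The load ratio of 4096 is an assumed example, and the helper name `size_chain` is hypothetical. Signal polarity (even versus odd stage counts) is ignored.

```python
import math

def size_chain(load_ratio, target_f):
    """Return (N, actual_f, delay_in_tau0) for an inverter chain.

    N is the integer nearest ln(ratio)/ln(target_f); actual_f is then
    chosen so that actual_f ** N equals the load ratio.
    """
    n = max(1, round(math.log(load_ratio) / math.log(target_f)))
    actual_f = load_ratio ** (1.0 / n)
    return n, actual_f, n * actual_f

LOAD_RATIO = 4096.0  # assumed C_L / C_in

n_e, f_e, delay_e = size_chain(LOAD_RATIO, math.e)
n_4, f_4, delay_4 = size_chain(LOAD_RATIO, 4.0)

print(f"target e: N={n_e}, f={f_e:.2f}, delay={delay_e:.2f} tau0")
print(f"target 4: N={n_4}, f={f_4:.2f}, delay={delay_4:.2f} tau0")
print(f"delay penalty for f=4: {100 * (delay_4 / delay_e - 1):.1f}%")
```

For this ratio the $f = 4$ chain needs 6 stages instead of 8 and pays roughly 6% in delay, consistent with the $f/\ln f$ comparison.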

9.5 Wire delay vs gate delay at modern nodes

At 180 nm, gate delay was ~30 ps and wire delay (for typical interconnect lengths) was negligible. At 7 nm, gate delay is ~5 ps but a millimeter-long wire on M2 has 200+ ps of delay. Wires are the bottleneck. This drives:

Aggressive use of upper, thick metals for long signals.
Repeater insertion every few hundred microns.
3D stacking (HBM, chiplets) to keep distances short.
