
5. FIR Filter Design

5.1 Why FIR

Three big reasons to use FIR:

  1. Linear phase is always achievable by enforcing coefficient symmetry. Linear phase means every frequency component is delayed by the same amount, so the waveform shape is preserved (at the cost of constant total delay). Audio crossovers, image processing, modem equalizers, and biomedical signal processing all care about this.
  2. Always stable. There is no feedback, so every pole sits at the origin. You cannot accidentally make an FIR unstable. This is a relief for fixed-point implementations, where IIR coefficient quantization could push poles outside the unit circle.
  3. Easy in fixed-point and easy to implement in hardware. No recursion, no concerns about quantization-induced instability, very forgiving of coefficient quantization, naturally maps onto MAC-array architectures and FPGA tapped-delay-lines.

The price: higher order than an equivalent IIR. The 2 kHz lowpass we designed in 4.1 needed an 8th-order IIR or a 100th-order FIR, so computation per sample is roughly $10\times$ higher.

5.2 Linear phase: types I, II, III, IV

For an FIR of length $M$ with coefficients $h[n]$, $n = 0, \ldots, M-1$, linear phase is obtained when

$$h[n] = \pm h[M - 1 - n]$$

i.e., the coefficient sequence is either symmetric ($+$) or antisymmetric ($-$) about its center. There are four types depending on the sign of the symmetry and the parity of $M$:

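| Type | Symmetry | Length $M$ | Notes |
|------|----------|------------|-------|
| I | symmetric | odd | Most general, supports LP/HP/BP/BS |
| II | symmetric | even | Forced zero at $\omega = \pi$, so cannot do high-pass (or band-stop) |
| III | antisymmetric | odd | Always zero at DC and $\omega = \pi$; for differentiators, Hilbert |
| IV | antisymmetric | even | Zero at DC; for differentiators |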

Types I and II are typical for amplitude filtering (LP, HP, BP, BS; Type II only where the response may be zero at $\omega = \pi$, i.e. LP and BP). Types III and IV are for special-purpose linear systems like differentiators (multiply by $j\omega$ in frequency, antisymmetric impulse response in time) and Hilbert transformers (90-degree phase shifters, also antisymmetric).
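The classification above can be checked numerically. The following is a minimal sketch, assuming real coefficients; the helper names `fir_type` and `gain_at` are illustrative and do not come from any library. It labels a coefficient vector by its symmetry and length parity, then evaluates the gain at DC and at $\omega = \pi$ to show the forced zeros listed in the table.

```python
import numpy as np

def fir_type(h, tol=1e-12):
    """Return 'I', 'II', 'III', 'IV', or None for a real coefficient vector."""
    h = np.asarray(h, dtype=float)
    odd_length = len(h) % 2 == 1
    if np.allclose(h, h[::-1], atol=tol):        # h[n] = +h[M-1-n]
        return "I" if odd_length else "II"
    if np.allclose(h, -h[::-1], atol=tol):       # h[n] = -h[M-1-n]
        return "III" if odd_length else "IV"
    return None                                   # not linear phase

def gain_at(h, omega):
    """Magnitude of the DTFT of h at frequency omega (rad/sample)."""
    n = np.arange(len(h))
    return abs(np.sum(np.asarray(h, dtype=float) * np.exp(-1j * omega * n)))

# Tiny hand-picked examples, one per type (illustrative only).
examples = {
    "I": [1.0, 2.0, 1.0],
    "II": [1.0, 1.0],
    "III": [1.0, 0.0, -1.0],
    "IV": [1.0, -1.0],
}
for name, h in examples.items():
    print(name, fir_type(h), gain_at(h, 0.0), gain_at(h, np.pi))
```

The Type II example has zero gain at $\omega = \pi$, the Type III example at both DC and $\omega = \pi$, and the Type IV example at DC only.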

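The phase of a symmetric (Type I or II) linear-phase FIR is

$$\angle H(e^{j\omega}) = -\omega(M-1)/2$$

a perfectly linear function of $\omega$ (apart from jumps of $\pi$ where the amplitude changes sign), with a constant group delay of $(M-1)/2$ samples, independent of frequency. The antisymmetric types add a constant $\pi/2$ offset to the phase but have the same group delay.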
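As a quick numerical check of the constant group delay, here is a small sketch using `scipy.signal.group_delay`. The filter length and cutoff are arbitrary choices for illustration, and the evaluation is restricted to the passband because the group delay is numerically ill-defined at stopband zeros on the unit circle.

```python
import numpy as np
from scipy import signal

M = 31                                        # odd length, Type I
h = signal.firwin(M, 0.3)                     # Hamming-windowed lowpass, cutoff 0.3 * Nyquist
w_pass = np.linspace(0.0, 0.2 * np.pi, 64)    # passband frequencies only (rad/sample)

# group_delay takes the filter as a (b, a) pair; a = [1.0] for an FIR
_, gd = signal.group_delay((h, [1.0]), w=w_pass)

print("expected:", (M - 1) / 2)
print("measured min/max:", gd.min(), gd.max())
```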

5.3 The window method

The most intuitive FIR design: start with the ideal frequency response, take its inverse-DTFT to get the ideal impulse response, then truncate to a finite length using a window.

Step 1: ideal frequency response. For an ideal lowpass with cutoff $\omega_c$:

$$H_d(e^{j\omega}) = \begin{cases} 1 & |\omega| \leq \omega_c \\ 0 & \omega_c < |\omega| \leq \pi \end{cases}$$

Step 2: ideal impulse response (inverse DTFT):

$$h_d[n] = \frac{\sin(\omega_c n)}{\pi n}$$

a sinc, infinite in extent, two-sided.

Step 3: window. Shift the ideal response by $(M-1)/2$ samples and multiply by a finite-length window $w[n]$ (typically symmetric about its midpoint, length $M$, nonzero only for $0 \leq n \leq M-1$):

$$h[n] = h_d[n - (M-1)/2]\,w[n]$$

The shift centers the response. The window product zeros out everything outside the window's support.
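The three steps can be sketched directly in NumPy. This is a minimal illustration, assuming a Hamming window and arbitrary values for `M` and `wc`; it is compared against `scipy.signal.firwin` with `scale=False`, which leaves the windowed sinc unnormalized.

```python
import numpy as np
from scipy import signal

M = 51                       # filter length (illustrative choice)
wc = 0.3 * np.pi             # cutoff in rad/sample (illustrative choice)

n = np.arange(M)
m = n - (M - 1) / 2          # shifted index: centers the sinc in the window

# Step 2: ideal impulse response sin(wc*m)/(pi*m).
# np.sinc(x) = sin(pi*x)/(pi*x), which also handles m = 0 (value wc/pi).
hd = (wc / np.pi) * np.sinc(wc * m / np.pi)

# Step 3: multiply by a symmetric length-M window.
w = np.hamming(M)
h = hd * w

# Reference design; scale=False skips firwin's unity-gain normalization.
h_ref = signal.firwin(M, wc / np.pi, window="hamming", scale=False)
print(np.max(np.abs(h - h_ref)))
```

By default `firwin` additionally rescales the taps so the DC gain is exactly 1, which is usually what you want in practice.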

The truncation introduces Gibbs phenomenon: ripples near the cutoff frequency (about 9% overshoot at the discontinuity, no matter how long you make a rectangular window). Smoother windows trade transition-band width for stopband attenuation.
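The claim that the overshoot does not shrink with length can be checked with a short experiment. The sketch below (helper name `rect_lowpass` is hypothetical) truncates the ideal sinc with a rectangular window at three lengths and reports the peak of the magnitude response.

```python
import numpy as np
from scipy import signal

def rect_lowpass(M, wc):
    """Ideal lowpass impulse response truncated to M taps (rectangular window)."""
    m = np.arange(M) - (M - 1) / 2
    return (wc / np.pi) * np.sinc(wc * m / np.pi)

wc = 0.4 * np.pi
peaks = {}
for M in (51, 201, 801):
    _, H = signal.freqz(rect_lowpass(M, wc), worN=1 << 14)
    peaks[M] = np.abs(H).max()      # passband peak just below the cutoff
    print(M, peaks[M])
```

The ripples crowd closer to the cutoff as `M` grows, but the peak stays near 1.09 rather than converging to 1.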

5.4 Common window functions

Each window has a different magnitude spectrum, and the FIR's frequency response is approximately the convolution of the ideal response (a rect) with the window's spectrum. The window's spectrum has a main lobe (whose width sets the transition-band width of the FIR) and side lobes (which set the stopband attenuation).

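| Window | Main-lobe width (rad/sample, between zero crossings) | Peak side lobe |
|--------|------------------------------------------------------|----------------|
| Rectangular | $4\pi/M$ | $-13$ dB |
| Bartlett (triangular) | $8\pi/M$ | $-25$ dB |
| Hann | $8\pi/M$ | $-31$ dB |
| Hamming | $8\pi/M$ | $-41$ dB |
| Blackman | $12\pi/M$ | $-57$ dB |
| Kaiser ($\beta$) | varies with $\beta$ | tunable |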

The story: the rectangular window has the narrowest main lobe (best transition-band sharpness for a given $M$) but the worst side lobes ($-13$ dB peak, which leaves only about 21 dB of stopband attenuation, often unacceptable). Hann and Hamming double the main-lobe width but bring the peak side lobe down to about $-31$ dB and $-41$ dB respectively. Blackman widens the main lobe further but reaches $-57$ dB side lobes. Kaiser is parameterized by $\beta$; you pick $\beta$ to hit a target stopband attenuation, then compute the $M$ required to hit a target transition-band width.

Main-lobe vs side-lobe tradeoff. Windowing truncates the infinite ideal sinc to a finite length. The shorter the window, the wider its spectrum. The smoother (more tapered) the window, the lower the side lobes but the wider the main lobe; one is always traded against the other. Rectangular has the sharpest cutoff in time, which gives the narrowest main lobe in frequency, but its abrupt edges generate large side lobes (think of the rect-sinc duality from Chapter 3). Tapering the window edges reduces the side lobes but stretches the main lobe.
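The peak side-lobe figures can be estimated numerically from a zero-padded FFT of each window. This is a rough sketch (the helper `peak_sidelobe_db` is hypothetical): it walks down the main lobe to the first local minimum and takes the largest value beyond it. Measured values depend slightly on the window length and on the exact window definition, so expect them to land near the table entries rather than exactly on them.

```python
import numpy as np
from scipy.signal import get_window

def peak_sidelobe_db(w, nfft=1 << 16):
    """Peak side-lobe level of window w, in dB relative to the main-lobe peak."""
    W = np.abs(np.fft.rfft(w, nfft))      # zero-padded spectrum, 0..pi
    W = W / W[0]                          # normalize to the DC (main-lobe) peak
    # First index where the spectrum starts rising again = edge of main lobe.
    first_null = np.nonzero(np.diff(W) > 0)[0][0]
    return 20 * np.log10(W[first_null:].max())

M = 64
levels = {
    name: peak_sidelobe_db(get_window(name, M, fftbins=False))
    for name in ("boxcar", "hann", "hamming", "blackman")   # boxcar = rectangular
}
for name, db in levels.items():
    print(f"{name:9s} {db:6.1f} dB")
```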

For a target stopband attenuation $A$ (dB) and transition width $\Delta\omega$, Kaiser's design formula:

$$M \approx \frac{A - 7.95}{2.285\,\Delta\omega} + 1$$

$$\beta = \begin{cases} 0.1102(A - 8.7) & A > 50 \\ 0.5842(A-21)^{0.4} + 0.07886(A-21) & 21 \leq A \leq 50 \\ 0 & A < 21 \end{cases}$$

So a Kaiser FIR with $A = 60$ dB stopband attenuation and $\Delta\omega = 0.05\pi \approx 0.157$ rad/sample needs $M \approx (60 - 7.95)/(2.285 \cdot 0.157) + 1 \approx 146$ taps. Plug into scipy:

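```python
import numpy as np
from scipy import signal

M = 147                         # odd length, for Type I symmetry
wc = 0.3 * np.pi                # cutoff in rad/sample
beta = signal.kaiser_beta(60)   # Kaiser beta for 60 dB stopband attenuation
# firwin expects the cutoff normalized to Nyquist (1.0 == pi rad/sample)
h = signal.firwin(M, wc / np.pi, window=('kaiser', beta))
w, H = signal.freqz(h, worN=2048)   # frequency response at 2048 points on [0, pi)
```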
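To connect the design formulas to the result, the sketch below evaluates them by hand (the helper `kaiser_params` is hypothetical), rounds the length up to an odd integer, and measures the stopband of the resulting filter. It assumes the stopband starts half a transition width above the cutoff, since the window method places the cutoff at the middle of the transition band.

```python
import numpy as np
from scipy import signal

def kaiser_params(A, delta_omega):
    """Kaiser's empirical estimates: (fractional length M, beta)."""
    M = (A - 7.95) / (2.285 * delta_omega) + 1
    if A > 50:
        beta = 0.1102 * (A - 8.7)
    elif A >= 21:
        beta = 0.5842 * (A - 21) ** 0.4 + 0.07886 * (A - 21)
    else:
        beta = 0.0
    return M, beta

A = 60.0                      # target stopband attenuation, dB
dw = 0.05 * np.pi             # transition width, rad/sample
wc = 0.3 * np.pi              # cutoff (center of the transition band)

M_est, beta = kaiser_params(A, dw)
M = int(np.ceil(M_est))
if M % 2 == 0:
    M += 1                    # force odd length for Type I

h = signal.firwin(M, wc / np.pi, window=("kaiser", beta))
w, H = signal.freqz(h, worN=8192)
stop = w >= wc + dw / 2       # assumed stopband edge
stop_db = 20 * np.log10(np.abs(H[stop]).max())
print(M_est, M, beta, stop_db)
```

SciPy also provides `signal.kaiserord`, which packages the same estimate; the hand-written version here is only meant to make the formulas concrete.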

5.5 Frequency-sampling design

An alternative: specify the desired magnitude at $M$ uniformly spaced frequencies $\omega_k = 2\pi k/M$, take the inverse DFT of those values, and you get the FIR coefficients directly. The filter exactly matches at the specified frequencies but may overshoot in between.

Useful for arbitrary-shape responses (not just LP/HP/BP/BS), like equalizer filters whose target shape is given by listening tests, not a closed-form formula.
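A minimal sketch of frequency-sampling design follows, assuming an odd length and a simple lowpass target chosen only for illustration. To get real, symmetric (Type I) coefficients, the magnitude samples must be mirror-symmetric in $k$ and carry the linear-phase term $e^{-j\omega_k (M-1)/2}$ before the inverse DFT.

```python
import numpy as np

M = 33                                   # odd length (Type I), illustrative
k = np.arange(M)

# Desired magnitude at omega_k = 2*pi*k/M: pass the lowest 6 bins and
# their mirror images so that A[k] == A[M-k] and h comes out real.
A = np.where((k <= 5) | (k >= M - 5), 1.0, 0.0)

# Attach the linear-phase term for a delay of (M-1)/2 samples.
H = A * np.exp(-1j * np.pi * k * (M - 1) / M)

h_complex = np.fft.ifft(H)
h = h_complex.real                       # imaginary part is numerical noise

# The DFT of h reproduces the specified samples exactly ...
H_check = np.abs(np.fft.fft(h))
# ... while a dense grid shows what happens between them.
H_dense = np.abs(np.fft.fft(h, 4096))
print(np.max(np.abs(H_check - A)), H_dense.max())
```

A common refinement, not shown here, is to place one or two intermediate values in the transition band instead of jumping straight from 1 to 0, which reduces the ripple between samples.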

5.6 Parks-McClellan (Remez exchange) optimal equiripple

The most powerful classical FIR design method. Parks and McClellan (1972), building on Chebyshev's equiripple approximation theory, showed how to compute the FIR coefficients of length $M$ that minimize the maximum deviation from the desired response in a weighted sense.

Algorithm: iteratively adjust the locations of the equal-amplitude ripples in passband and stopband until the maximum ripple is as small as possible. The result has equiripple behavior: ripples of equal (weighted) amplitude throughout the passband and stopband. This is provably optimal (in the Chebyshev sense).

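In scipy: `signal.remez(M, [pass1, pass2, stop1, stop2], [pass_gain, stop_gain], fs=fs)`, where the second argument lists the band edges in increasing order and `fs` is the sampling frequency (older SciPy releases called this keyword `Hz`). For most production FIRs needing minimum order, Parks-McClellan beats the window method by 10-30% in tap count.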
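Here is a small sketch of an equiripple lowpass with `signal.remez`. The sampling rate, band edges, and tap count are assumptions picked for illustration, not values from the text. With equal weights on both bands, the measured passband and stopband ripple should come out about the same size.

```python
import numpy as np
from scipy import signal

fs = 8000.0                          # sampling rate in Hz (illustrative)
numtaps = 61                         # odd length, Type I
bands = [0, 1000, 1500, fs / 2]      # passband 0-1000 Hz, stopband 1500-4000 Hz
desired = [1, 0]                     # gain in each band

h = signal.remez(numtaps, bands, desired, fs=fs)

w, H = signal.freqz(h, worN=4096, fs=fs)     # w is in Hz because fs is given
mag = np.abs(H)
pass_err = np.abs(mag[w <= 1000] - 1).max()  # peak passband deviation
stop_err = mag[w >= 1500].max()              # peak stopband level
print(pass_err, stop_err, 20 * np.log10(stop_err))
```

Passing a `weight` argument to `remez` trades ripple between the bands, for example to buy more stopband attenuation at the cost of passband flatness.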

5.7 IIR vs FIR: a comparison table

| | IIR | FIR |
|---|-----|-----|
| Order for given specs | Low (8–20 typical) | High (50–500 typical) |
| Stability | Can be unstable | Always stable |
| Linear phase | Hard (use all-pass equalizers) | Easy (symmetric coefficients) |
| Recursion | Yes | No |
| Implementation cost per output sample | Low | High |
| Memory (state) | Few biquad states | Long delay line |
| Coefficient quantization sensitivity | High | Low |
| Direct-from-analog design | Yes (bilinear, impulse invariance) | No |
| Used in | RF preselection, audio EQ, control | Audio crossovers, image, comms equalizers, biomedical |

In practice: pick FIR when phase matters or quantization is severe, IIR when computation must be minimal and phase is unimportant.
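To get a feel for the order gap in the first table row, the sketch below asks SciPy's order estimators for an elliptic IIR and a Kaiser-window FIR under one illustrative spec. The spec values are assumptions, and the comparison is only rough: the Kaiser design ends up with a much smaller passband ripple than the 1 dB allowed to the IIR.

```python
from scipy import signal

# Illustrative lowpass spec, frequencies normalized to Nyquist = 1.
wp, ws = 0.25, 0.30          # passband and stopband edges
gpass, gstop = 1.0, 60.0     # max passband loss and min stopband attenuation, dB

n_iir, _ = signal.ellipord(wp, ws, gpass, gstop)   # minimum elliptic IIR order
n_fir, beta = signal.kaiserord(gstop, ws - wp)     # Kaiser FIR tap-count estimate

print("elliptic IIR order:", n_iir)
print("Kaiser FIR taps:   ", n_fir)
```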

5.8 FIR application: disk-drive read channel

Modern hard-drive read channels use FIR equalizers to compensate for the smearing of magnetic transitions on the disk. The channel has a known impulse response (the read-head's spatial transfer); the FIR equalizer is the inverse, sharpening the read pulses for clean detection. Linear phase is essential: any phase distortion would shift the timing of the pulses and break the bit-clock recovery. So FIR, every time.

5.9 FIR application: image-processing kernels

A 2D FIR is a matrix of coefficients. Convolution with the image is the basic operation behind blur, sharpen, edge detection, embossing. Photoshop's "Gaussian blur" is a 2D Gaussian-shaped FIR. Edge-detection kernels (Sobel, Prewitt) are antisymmetric 2D FIRs that compute discrete derivatives.
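As a tiny illustration, the sketch below convolves a synthetic step-edge image with a Sobel kernel and with a 3x3 binomial kernel (a common small stand-in for a Gaussian; the kernel choice and image are assumptions made for the example). Note that true convolution flips the kernel, so a left-to-right rising edge gives a negative Sobel response here.

```python
import numpy as np
from scipy.signal import convolve2d

# Synthetic 5x6 image: dark on the left, bright on the right.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# Horizontal-derivative Sobel kernel (antisymmetric left-to-right).
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# 3x3 binomial blur kernel, outer product of [1, 2, 1] / 4 with itself.
b = np.array([1.0, 2.0, 1.0]) / 4.0
blur = np.outer(b, b)

edges = convolve2d(image, sobel_x, mode="valid")   # nonzero only near the step
blurred = convolve2d(image, blur, mode="valid")    # step smeared into a ramp

print(edges[0])
print(blurred[0])
```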

In computer vision and CNNs, the "convolutional layer" is a stack of learned 2D FIRs. The whole edifice of modern image AI rests on FIRs whose coefficients are tuned by gradient descent rather than designed analytically.