>
section 1 of 1112 min read

1. Discrete-Time Signals and Systems, Recapped and Sharpened

The first job is to make sure our discrete-time vocabulary is sharp. You met most of this in Chapter 3, but DSP demands a slightly more formal grip on it, so we will revisit each concept with the angle it gets used at later in the chapter.

1.1 Sequences and where they come from

A discrete-time signal x[n]x[n] is a function of an integer index nZn \in \mathbb{Z}. The brackets, instead of parentheses, are not decorative: they tell every reader of your code or paper that this is a sampled, integer-indexed sequence, not a continuous function.

Most of the discrete-time signals we will deal with come from sampling a continuous-time signal at uniform intervals TsT_s (the sample period), with sample rate fs=1/Tsf_s = 1/T_s:

x[n]=xa(nTs)x[n] = x_a(nT_s)

But there are sequences that are intrinsically discrete: stock prices at end of trading day, the count of packets on a network in each one-second window, the value of a register read every clock cycle. These do not come from sampling a continuous-time signal; they are discrete by birth.

A practical worry: every line of every audio file in your library, every captured power trace from a chip, every line of OFDM samples in a Wi-Fi packet, is a sequence x[n]x[n] stored as an array. The DSP we develop in this chapter is what manipulates that array.

1.2 The standard discrete-time sequences

You will see these constantly. Memorize their pictures.

Unit impulse δ[n]\delta[n]: equals 1 at n=0n=0, zero everywhere else. The discrete-time sibling of the Dirac delta. Crucially, δ[n]\delta[n] is well-behaved: it has finite value, finite energy, no distribution-theory subtleties. Every discrete-time signal can be written as a sum of weighted, shifted impulses:

x[n]=kx[k]δ[nk]x[n] = \sum_k x[k]\,\delta[n - k]

This is the discrete sifting identity, and it is the seed from which discrete convolution will sprout in Section 1.5.

Unit step u[n]u[n]: 1 for n0n \geq 0, zero for n<0n < 0.

Unit ramp r[n]=nu[n]r[n] = n \cdot u[n]: zero for n<0n < 0, then 0,1,2,3,0, 1, 2, 3, \ldots for n0n \geq 0.

Real exponential anu[n]a^n u[n]: decays if a<1|a| < 1, grows if a>1|a| > 1. Models RC discharge sampled at uniform intervals (you would expect et/τe^{-t/\tau} in continuous time; sample it and you get ana^n with a=eTs/τa = e^{-T_s/\tau}).

Complex exponential ejω0ne^{j\omega_0 n}: the discrete-time analog of ejω0te^{j\omega_0 t}. The eigenfunction of every LTI discrete-time system, exactly the same way sinusoids were eigenfunctions in continuous time. This is what frequency means in the discrete world.

Sinusoid cos(ω0n+ϕ)\cos(\omega_0 n + \phi): the projection of a complex exponential onto the real axis.

plaintext
     δ[n]

   1  │

   ───┼────────────  n →
   -2 -1 0 1 2 3
        ╳        (only the n=0 sample is nonzero)
plaintext
     u[n]
   1  │
      │   ●  ●  ●  ●  ●  ●  ●

   0  ●  ●
      ───┼────────────  n →
         0

1.3 Periodicity in discrete time, an unobvious twist

In continuous time, every cos(ω0t)\cos(\omega_0 t) is periodic. Pick any ω0>0\omega_0 > 0 and you get a periodic signal with period T=2π/ω0T = 2\pi/\omega_0. Always.

In discrete time, this is not true. A discrete-time sinusoid cos(ω0n)\cos(\omega_0 n) is periodic only if ω0/(2π)\omega_0/(2\pi) is rational. To see why: periodicity means cos(ω0(n+N))=cos(ω0n)\cos(\omega_0 (n+N)) = \cos(\omega_0 n) for some integer NN, which requires ω0N=2πk\omega_0 N = 2\pi k for some integer kk, which forces ω0/(2π)=k/N\omega_0/(2\pi) = k/N, a ratio of integers.

So cos(0.1πn)\cos(0.1\pi n) is periodic with period 20 (since 0.1π/(2π)=1/200.1\pi/(2\pi) = 1/20). But cos(n)\cos(n) (with ω0=1\omega_0 = 1 rad/sample) is not periodic at all, because 1/(2π)1/(2\pi) is irrational. It looks vaguely repetitive in a plot but never quite repeats.

A second twist, deeper: discrete-time frequencies are defined only modulo 2π2\pi. The signal ejω0ne^{j\omega_0 n} with ω0\omega_0 and ω0+2π\omega_0 + 2\pi are the same sequence, because ej2πn=1e^{j2\pi n} = 1 for any integer nn. So discrete frequencies live on the unit circle: every unique frequency is in [0,2π)[0, 2\pi) or equivalently [π,π)[-\pi, \pi). This is why we keep talking about the unit circle in the z-plane. The maximum representable frequency in a sampled system is ω=π\omega = \pi, corresponding to alternation 1,1,1,1,1, -1, 1, -1, \ldots. Any higher continuous-time frequency aliases down into this range, the very phenomenon we treated in Chapter 3.

1.4 Energy and power signals (discrete version)

A discrete-time signal has energy

E=n=x[n]2E = \sum_{n=-\infty}^{\infty} |x[n]|^2

If EE is finite, it is an energy signal. A finite-length burst, a single drum hit captured as a wav, a single power-trace acquisition: all energy signals.

If EE is infinite but the average power

P=limN12N+1n=NNx[n]2P = \lim_{N\to\infty}\,\frac{1}{2N+1}\sum_{n=-N}^{N} |x[n]|^2

is finite, it is a power signal. Every nontrivial periodic sequence is a power signal (infinite total energy because it goes on forever; finite average because it stays bounded).

Random sequences, like a captured stream of thermal noise, are also typically power signals. We will return to this when we discuss the power spectral density and matched filtering for signals corrupted by noise.

1.5 Discrete-time systems and their classifications

A discrete-time system is a transformation y[n]=T{x[n]}y[n] = T\{x[n]\}. We classify systems by which structural properties they obey, because each property gives us a different mathematical lever.

Linear: T{ax1+bx2}=aT{x1}+bT{x2}T\{ax_1 + bx_2\} = aT\{x_1\} + bT\{x_2\}. Superposition holds.

Time-invariant: delay the input, you delay the output by the same amount; otherwise nothing changes. T{x[nk]}=y[nk]T\{x[n - k]\} = y[n - k].

LTI: both linear and time-invariant. Most filters we will design fall here.

Memoryless / static: y[n]y[n] depends only on x[n]x[n]. A simple gain y[n]=2x[n]y[n] = 2 x[n] is memoryless. A delay y[n]=x[n1]y[n] = x[n-1] is not memoryless because the system has to remember the previous input.

Causal: y[n]y[n] depends only on x[n],x[n1],x[n2],x[n], x[n-1], x[n-2], \ldots (present and past inputs), never on x[n+1]x[n+1] or beyond. Real-time systems must be causal because the future has not happened. Off-line processing (working on a recorded file) can use noncausal systems if convenient.

BIBO stable: every bounded input (x[n]M|x[n]| \leq M) produces a bounded output (y[n]M|y[n]| \leq M'). For an LTI system, this is equivalent to

n=h[n]<,\sum_{n=-\infty}^{\infty} |h[n]| < \infty,

i.e., the impulse response is absolutely summable. Equivalently, all poles of H(z)H(z) lie strictly inside the unit circle. Same fact, two viewpoints.

Why absolute summability and not just energy? If h[n]\sum|h[n]| diverges, you can construct a bounded input (x[n]=sign(h[n])x[n] = \text{sign}(h[-n]), all ±1\pm 1) that drives the output to infinity. So the absolute-sum condition is exactly what you need to guarantee bounded outputs for every bounded input.

1.6 The convolution sum, derived (not memorized)

For an LTI system, the impulse response h[n]=T{δ[n]}h[n] = T\{\delta[n]\} is what comes out when you poke it with a unit impulse. The claim, which is the most important fact in this chapter, is that h[n]h[n] completely characterizes the system. Knowing h[n]h[n], you can compute the output for any input. Here is why.

Step 1. Use the sifting identity to write any input as a weighted sum of impulses:

x[n]=kx[k]δ[nk]x[n] = \sum_k x[k]\,\delta[n - k]

Step 2. Push it through the system. By linearity:

y[n]=T ⁣{kx[k]δ[nk]}=kx[k]T{δ[nk]}y[n] = T\!\left\{\sum_k x[k]\,\delta[n-k]\right\} = \sum_k x[k]\,T\{\delta[n-k]\}

Step 3. By time-invariance, T{δ[nk]}=h[nk]T\{\delta[n-k]\} = h[n-k].

Step 4. Putting it together:

y[n]=kx[k]h[nk]=(xh)[n]\boxed{\,y[n] = \sum_k x[k]\,h[n-k] = (x * h)[n]\,}

This is the discrete convolution sum. We did not assume convolution; we derived it from linearity plus time-invariance plus the sifting identity. That is why convolution is the operation in LTI theory: it is forced by the structure.

Echo-and-music analogy, sharpened. Picture a music recording x[n]x[n] being played in a hall whose impulse response (recorded by clapping once) is h[n]h[n]. What you hear, y[n]y[n], is the music with the hall's acoustics overlaid. Convolution combines them. Each instant of input music excites a copy of the hall response, scaled by the music's loudness at that instant; the copies overlap and sum into the heard sound. Convolution reverbs in audio plug-ins do exactly this with sampled impulse responses of famous halls.

A worked example to anchor it. Let x[n]=[1,2,3]x[n] = [1, 2, 3] (nonzero for n=0,1,2n = 0, 1, 2) and h[n]=[1,1,1]h[n] = [1, 1, 1] (a 3-tap moving sum). Then

y[n]=kx[k]h[nk]y[n] = \sum_k x[k]\,h[n-k]

For n=0n = 0: only k=0k = 0 contributes (since h[0k]h[0-k] is only nonzero for k=0,1,2k = 0, -1, -2, but x[k]x[k] is zero outside 0,1,20, 1, 2). y[0]=x[0]h[0]=1y[0] = x[0]\,h[0] = 1.

For n=1n = 1: k=0k = 0 gives x[0]h[1]=1x[0]h[1] = 1, k=1k = 1 gives x[1]h[0]=2x[1]h[0] = 2. So y[1]=3y[1] = 3.

For n=2n = 2: k=0k = 0 gives 11, k=1k=1 gives 22, k=2k=2 gives 33. y[2]=6y[2] = 6.

For n=3n = 3: k=1,2k = 1, 2 contribute. y[3]=2+3=5y[3] = 2 + 3 = 5.

For n=4n = 4: k=2k = 2 contributes. y[4]=3y[4] = 3.

So y[n]=[1,3,6,5,3]y[n] = [1, 3, 6, 5, 3]. Output length is Lx+Lh1=3+31=5L_x + L_h - 1 = 3 + 3 - 1 = 5 samples. The output is longer than either input; convolution stretches the time axis.

1.7 Difference equations, FIR vs IIR

Most LTI discrete systems we will design are described by a linear constant-coefficient difference equation:

k=0Naky[nk]=k=0Mbkx[nk]\sum_{k=0}^{N} a_k\,y[n-k] = \sum_{k=0}^{M} b_k\,x[n-k]

By convention a0=1a_0 = 1, so we can solve for y[n]y[n]:

y[n]=k=0Mbkx[nk]    k=1Naky[nk]y[n] = \sum_{k=0}^{M} b_k\,x[n-k] \;-\; \sum_{k=1}^{N} a_k\,y[n-k]

The first sum is the feed-forward part: a tapped-delay-line of past inputs. The second sum is the feedback part: a tapped-delay-line of past outputs, fed back into the equation. These two sums divide all our digital filters into two camps.

FIR (Finite Impulse Response). Set all ak=0a_k = 0 for k1k \geq 1 (no feedback). The output is a pure feed-forward sum of past inputs, with no recursion. The impulse response is exactly h[n]=bnh[n] = b_n for n=0,,Mn = 0, \ldots, M and zero otherwise. Finite length (hence the name). The whole impulse response is the coefficient vector. Always stable, always finite-length.

IIR (Infinite Impulse Response). At least one ak0a_k \neq 0 for k1k \geq 1. The output recursively depends on past outputs. The impulse response, in general, runs forever (decaying if stable, exploding if not). Hence "infinite" impulse response. IIR filters are typically much shorter (lower-order) than FIRs of equivalent magnitude response, but they trade nonlinear phase and potential stability problems for that compactness.

The choice between FIR and IIR is one of the most important design decisions in DSP, and we will treat each at length in Sections 4 and 5.

1.8 Z-transform recap, ROC and stability

From Chapter 3 we have

X(z)=n=x[n]znX(z) = \sum_{n=-\infty}^{\infty} x[n]\,z^{-n}

with the region of convergence (ROC) being the set of complex zz for which the sum converges. The ROC is part of the Z-transform; the same algebraic expression with two different ROCs corresponds to two different sequences (one causal, one anti-causal).

A few facts to keep on hand for filter design:

  • For a causal sequence, the ROC extends outward from the outermost pole. So if X(z)X(z) has poles at z=0.9|z| = 0.9 and z=0.5|z| = 0.5, and the sequence is causal, the ROC is z>0.9|z| > 0.9.
  • For the system to be both causal and stable, the ROC must include the unit circle. Combining the two facts above: a causal LTI system is stable iff all its poles lie strictly inside the unit circle.
  • A pole on the unit circle means marginal stability (the system rings forever without decaying). Strictly outside, instability.

The Z-transform of the basic sequences:

SequenceX(z)X(z)ROC
δ[n]\delta[n]11all zz
u[n]u[n]11z1\dfrac{1}{1 - z^{-1}}$
anu[n]a^n u[n]11az1\dfrac{1}{1 - a z^{-1}}$
anu[n1]-a^n u[-n-1]11az1\dfrac{1}{1 - a z^{-1}}$
nanu[n]n a^n u[n]az1(1az1)2\dfrac{a z^{-1}}{(1 - a z^{-1})^2}$
cos(ω0n)u[n]\cos(\omega_0 n)\,u[n]1z1cosω012z1cosω0+z2\dfrac{1 - z^{-1}\cos\omega_0}{1 - 2z^{-1}\cos\omega_0 + z^{-2}}$

Properties to remember by their narrative role:

  • Linearity. Combining systems algebraically is just adding their Z-transforms.
  • Time shift: x[nk]zkX(z)x[n - k] \leftrightarrow z^{-k} X(z). The factor z1z^{-1} is the algebraic alter ego of "delay by one sample." Every digital filter block diagram has z1z^{-1} blocks for delay registers.
  • Convolution becomes multiplication: x1x2X1X2x_1 * x_2 \leftrightarrow X_1 \cdot X_2. This is why we like the Z-transform (and its sibling, the DFT) so much: the pesky time-domain convolution turns into a much simpler frequency-domain multiplication.

The DTFT (Discrete-Time Fourier Transform) is just the Z-transform evaluated on the unit circle:

X(ejω)=nx[n]ejωnX(e^{j\omega}) = \sum_n x[n]\,e^{-j\omega n}

It is what we plot when we say "frequency response." It is continuous in ω\omega (defined for every real ω\omega, periodic with period 2π2\pi). On a computer we cannot represent a continuous function, so we will need a sampled version of it, which is the DFT, the topic of the next section.

1.9 Inverse Z-transform via partial fractions

Given a rational X(z)=N(z)/D(z)X(z) = N(z)/D(z), decompose into partial fractions. Each piece corresponds to a known time-domain sequence. Example:

X(z)=1(10.5z1)(10.8z1)X(z) = \frac{1}{(1 - 0.5 z^{-1})(1 - 0.8 z^{-1})}

Partial fractions:

X(z)=A10.5z1+B10.8z1X(z) = \frac{A}{1 - 0.5 z^{-1}} + \frac{B}{1 - 0.8 z^{-1}}

Solving (multiply out, equate numerators), we get A=5/3A = -5/3, B=8/3B = 8/3. Inverse-transform each piece (assuming causal):

x[n]=53(0.5)nu[n]+83(0.8)nu[n]x[n] = -\tfrac{5}{3}(0.5)^n u[n] + \tfrac{8}{3}(0.8)^n u[n]

For repeated poles or complex pole pairs, the partial fractions are slightly more involved but the pattern is the same.