Digital Signal Processing

Signals and Systems

Shazam identifies a song in 5 seconds among 100 million tracks. The mechanism: a spectrogram built via DSP. CD quality = 44100 Hz, because the sampling theorem (Nyquist's 1928 result, proved in full by Shannon in 1949) says that capturing audio up to 22 kHz requires a sampling rate of at least twice that. Without that theorem there is no digital era. 5G uses OFDM - 3300 subcarriers, each an independent DSP channel. All of this is signals and systems. And it all begins with one idea: a signal is a sequence of numbers, a system is a rule for transforming it.

  • **Shazam fingerprinting:** in 5 seconds a spectrogram is built via STFT - the signal becomes a set of (time, frequency, amplitude) points, then a fingerprint is matched in the database. Direct application of LTI and convolution
  • **5G OFDM:** 3300 subcarriers, each an independent LTI channel. Multiplexing via inverse FFT - O(N log N) instead of O(N²). No DSP, no 5G
  • **Whisper / speech recognition:** the signal first becomes a log-mel spectrogram (windowing + FFT + mel filter bank), then feeds into a transformer. DSP is the first stage of almost every speech system
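  • **MRI and radar:** radar recovers the impulse response of the medium via matched filtering and deconvolution; MRI reconstructs the image from frequency-domain samples with an inverse Fourier transform. In both cases the image is the solution to an inverse problem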
  • **MP3/AAC compression:** the psychoacoustic model operates in the frequency domain (FFT + MDCT). Frequencies the ear cannot hear are removed - information lives in the spectrum, not in time

Discrete Signals

**A discrete signal** x[n] is a sequence of numbers where n is an integer. Each sample x[n] is the signal value at moment n. Digital audio, image pixels, stock prices - all discrete signals. When Shazam analyzes microphone input, it immediately sees exactly this: x[0], x[1], x[2], ... - numbers, numbers, numbers.

Three fundamental building blocks. **Unit impulse** δ[n] - 1 at n=0, zero everywhere else. Looks unremarkable - but it is the atom every signal is built from. **Unit step** u[n] - switches on at zero and stays on. **Exponential** a^n·u[n] with |a| < 1 - the model for any decaying response: a bell ring, a capacitor discharge, a fading echo.

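| Signal | Definition | Energy / Power |
|---|---|---|
| δ[n] | 1 when n = 0, 0 otherwise | Energy = 1 |
| u[n] | 1 when n ≥ 0, 0 when n < 0 | Power = 1/2 (averaged over all n, positive and negative) |
| a^n · u[n], \|a\| < 1 | Decaying exponential | Energy = 1/(1 - a²) |
| cos(ω₀n) | Cosine | Power = 1/2 |
| A·δ[n-k] | Impulse of amplitude A at moment k | Energy = A² |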

**Finite-energy signals** (Σ|x[n]|² < ∞) - audio clips, images. **Finite-power signals** (lim as N→∞ of 1/(2N+1) · Σ|x[n]|² over n = -N..N is finite and nonzero) - periodic signals, noise. The impulse is an energy signal, a sinusoid is a power signal.
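The energy and power values quoted above can be checked numerically. The sketch below is only an approximation: the infinite sums are truncated, and the truncation lengths, the value a = 0.5 and the frequency ω₀ = 2π/20 are arbitrary choices made for illustration.

```python
import math

def energy(x):
    """Sum of squared magnitudes over the samples given."""
    return sum(abs(v) ** 2 for v in x)

def average_power(x):
    """Mean squared magnitude over the samples given."""
    return energy(x) / len(x)

a = 0.5                                    # assumed decay factor, |a| < 1
N = 200                                    # truncation length; a**N is negligible
decaying = [a ** n for n in range(N)]      # a^n * u[n] for n = 0..N-1
impulse = [1.0] + [0.0] * (N - 1)          # delta[n]

# A whole number of periods of cos(omega0 * n), with omega0 = 2*pi/20 (assumed)
omega0 = 2 * math.pi / 20
cosine = [math.cos(omega0 * n) for n in range(2000)]

energy_decaying = energy(decaying)         # close to 1 / (1 - a**2)
energy_impulse = energy(impulse)           # 1
power_cosine = average_power(cosine)       # close to 1/2
```

For a periodic signal, averaging over a whole number of periods gives the same value as the limit in the definition, which is why a finite window is enough for the cosine.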

Any discrete signal decomposes into a sum of shifted impulses: x[n] = Σ x[k]·δ[n-k], summed over all k. This is not a deep theorem - it follows directly from the definition of δ[n]: the only term that survives in the sum is k = n. And it is the key to convolution: if a system is linear and time-invariant, knowing its response to a single impulse is enough to compute the response to any signal automatically.
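The decomposition is easy to verify on a short finite signal. In this sketch the sample values are made up, and the signal is assumed to be zero outside the listed indices.

```python
def delta(n):
    """Unit impulse: 1 at n = 0, 0 elsewhere."""
    return 1 if n == 0 else 0

def rebuild(x, n):
    """Evaluate sum_k x[k] * delta[n - k] for a finite signal x."""
    return sum(x[k] * delta(n - k) for k in range(len(x)))

x = [4, 0, -1, 2, 5]                       # hypothetical samples x[0]..x[4]
rebuilt = [rebuild(x, n) for n in range(len(x))]
```

Each output sample picks out exactly one term of the sum, so `rebuilt` matches `x` sample for sample.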

Signal x[n] = 3·delta[n] - 2·delta[n-1] + delta[n-3]. What is x[1]?

Continuous Signals

**A continuous (analog) signal** x(t) is defined for all real t. A sound wave in air, the voltage at a microphone output, body temperature - continuous signals. A smartphone literally hears continuous air pressure. To pass it to the processor, sampling is required: x[n] = x(n·T_s), where T_s is the sampling period.

The continuous analog of the impulse is the Dirac delta δ(t): infinitely narrow, infinitely tall, with unit integral. Physically - a perfect click. The Heaviside step u(t) - powering on a circuit. Exponential e^(-at)·u(t) - a capacitor discharging. The same three building blocks as in the discrete world, only in continuous time.

| Property | Discrete x[n] | Continuous x(t) |
|---|---|---|
| Argument | Integer n | Real t |
| Impulse | δ[n] (Kronecker) | δ(t) (Dirac) |
| Energy | Σ\|x[n]\|² | ∫\|x(t)\|²dt |
| Periodicity | x[n] = x[n+N] | x(t) = x(t+T) |
| Processing | Digital (DSP) | Analog (circuits) |
| Spectrum | Periodic (DTFT) | Aperiodic (FT) |

**Nyquist-Shannon sampling theorem (Nyquist 1928, Shannon 1949):** sampling at rate f_s allows perfect reconstruction of a signal whose frequencies all lie below f_s/2. CD format: f_s = 44100 Hz, ceiling = 22050 Hz - just above the 20 kHz limit of human hearing. If the sampling rate is too low, aliasing occurs: high frequencies masquerade as low ones, and the signal is irreversibly distorted.
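Aliasing can be seen directly from the sampling rule x[n] = x(n·T_s). The sketch below assumes a sampling rate of 1000 Hz and two cosines, at 100 Hz and 900 Hz; the numbers are illustrative only. The 900 Hz tone lies above f_s/2, and its samples turn out to be indistinguishable from those of the 100 Hz tone.

```python
import math

fs = 1000.0             # assumed sampling rate, Hz
f_low = 100.0           # below fs / 2
f_high = fs - f_low     # 900 Hz, above fs / 2

def sample_cosine(freq_hz, fs_hz, count):
    """Samples x[n] = cos(2*pi*freq*n*T_s) with T_s = 1 / fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz) for n in range(count)]

low = sample_cosine(f_low, fs, 50)
high = sample_cosine(f_high, fs, 50)

# Largest sample-by-sample difference between the two sampled tones
max_diff = max(abs(p - q) for p, q in zip(low, high))
```

`max_diff` is zero up to floating-point rounding: once sampled, nothing distinguishes the two tones, which is why an anti-aliasing filter has to act before the converter.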

DSP works with discrete signals - computers operate on numbers, not functions. But understanding the continuous world is critical: the analog-digital boundary is exactly where aliasing, jitter and quantization noise live - the three error sources that haunt every real system.

Energy of signal x(t) = 2·e^(-3t)·u(t). What is it?

LTI Systems

**LTI (Linear Time-Invariant)** - a system that satisfies two conditions. **Linearity:** if x1 → y1 and x2 → y2, then a·x1 + b·x2 → a·y1 + b·y2. **Time-invariance:** if x[n] → y[n], then x[n-k] → y[n-k] for any k. Most filters, amplifiers and acoustic rooms are LTI to a good approximation. That is why room reverberation is modeled as the convolution of the signal with the room's impulse response.

The killer idea of LTI: the entire system is completely described by a single function - the **impulse response** h[n], the reaction to the unit impulse delta[n]. Record h[n] by firing a starting pistol in a concert hall and capturing the echo. That's everything needed. Any sound through that hall = convolution with h[n].

| System | LTI? | Reason |
|---|---|---|
| y[n] = 3x[n] + 2x[n-1] | Yes | Linear combination of shifts |
| y[n] = x[n]² | No | Nonlinear: (2x)² ≠ 2·x² |
| y[n] = x[-n] | No | Not time-invariant: reflection + shift ≠ shift + reflection |
| y[n] = n·x[n] | No | Not time-invariant: coefficient depends on n |
| y(t) = ∫x(τ)dτ from -∞ to t | Yes | Integrator - LTI |
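Three of the discrete systems in the table can be spot-checked numerically. This is a sketch, not a proof: passing the check on one pair of inputs does not establish linearity or time-invariance, but a single failure is enough to disprove it. The test signals, the weights a and b, and the delay k are arbitrary; signals are assumed to be zero before n = 0.

```python
def shift(x, k):
    """Delay a finite sequence by k >= 0 samples, keeping the same length."""
    return [0] * k + x[:len(x) - k]

def fir(x):
    """y[n] = 3x[n] + 2x[n-1], with x[-1] taken as 0."""
    return [3 * x[n] + 2 * (x[n - 1] if n > 0 else 0) for n in range(len(x))]

def square(x):
    """y[n] = x[n]^2"""
    return [v * v for v in x]

def ramp_gain(x):
    """y[n] = n * x[n]"""
    return [n * v for n, v in enumerate(x)]

def looks_linear(system, x1, x2, a=2, b=-3):
    """Compare system(a*x1 + b*x2) with a*system(x1) + b*system(x2)."""
    mixed = [a * u + b * v for u, v in zip(x1, x2)]
    expected = [a * u + b * v for u, v in zip(system(x1), system(x2))]
    return system(mixed) == expected

def looks_time_invariant(system, x, k=2):
    """Compare 'delay then process' with 'process then delay'."""
    return system(shift(x, k)) == shift(system(x), k)

# Trailing zeros leave room for the delay without losing samples
x1 = [1, 4, -2, 0, 3, 0, 0]
x2 = [2, -1, 5, 1, 0, 0, 0]

results = {
    name: (looks_linear(system, x1, x2), looks_time_invariant(system, x1))
    for name, system in [("fir", fir), ("square", square), ("ramp_gain", ramp_gain)]
}
```

Only `fir` passes both checks; `square` fails linearity and `ramp_gain` fails time-invariance, matching the table.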

**BIBO stability** (bounded input → bounded output): an LTI system is stable if and only if Σ|h[n]| < ∞. The impulse response must be absolutely summable.
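Absolute summability can be probed by watching partial sums of |h[n]|. The sketch compares two assumed causal impulse responses: h[n] = 0.9^n·u[n] and the accumulator h[n] = u[n]. Finite partial sums can only suggest convergence or divergence; they do not prove it.

```python
def partial_abs_sum(h, terms):
    """Sum of |h[n]| for n = 0 .. terms-1 (h is assumed causal)."""
    return sum(abs(h(n)) for n in range(terms))

def decaying(n):
    """h[n] = 0.9^n for n >= 0"""
    return 0.9 ** n

def accumulator(n):
    """h[n] = 1 for n >= 0 (running sum of the input)"""
    return 1.0

lengths = (50, 200, 1000)
decaying_sums = [partial_abs_sum(decaying, t) for t in lengths]
accumulator_sums = [partial_abs_sum(accumulator, t) for t in lengths]
```

The first list levels off near 1/(1 - 0.9) = 10, so that system is BIBO stable. The second grows in step with the number of terms: feed the accumulator a constant input and its output grows without bound.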

Causality: h[n] = 0 for n < 0 - the system does not look into the future. All real-time systems are causal. Non-causal filters exist only in offline processing: when Audacity applies a linear-phase EQ to a recorded file, it can use future samples - the recording is already fully known.

System y[n] = x[n] + x[n]². Is it LTI?

Convolution

**Convolution** is the central operation in DSP. Output of an LTI system: y[n] = x[n] * h[n] = Σ x[k]·h[n-k] for all k. Each input sample x[k] spawns a scaled, shifted copy of the impulse response. All copies overlap - that is the output.

The manual algorithm: 1) Flip h[k] → h[-k], 2) Shift by n: h[n-k], 3) Multiply element-wise by x[k], 4) Sum over k. Flipping h is the key difference from correlation. The flip makes the index n-k count how long ago the input x[k] arrived, so with a causal h (h[n] = 0 for n < 0) only past and present inputs affect the current output, not the other way around.
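The four steps translate almost line for line into a double loop. This is a minimal sketch for finite sequences that both start at n = 0; the input values are made up for illustration.

```python
def convolve(x, h):
    """Direct convolution y[n] = sum_k x[k] * h[n - k] for finite sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for n in range(len(y)):              # step 2: one shift per output sample
        for k in range(len(x)):
            j = n - k                    # step 1: the flip, h is read at n - k
            if 0 <= j < len(h):          # h is zero outside its listed samples
                y[n] += x[k] * h[j]      # steps 3 and 4: multiply and sum
    return y

x = [2, 0, -1, 3]                        # hypothetical input
h = [1, 2, 1]                            # hypothetical impulse response
y = convolve(x, h)                       # [2, 4, 1, 1, 5, 3]
```

The output has len(x) + len(h) - 1 samples, and the two nested loops are where the O(N²) cost of direct convolution comes from.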

| Property | Formula | Application |
|---|---|---|
| Commutativity | x * h = h * x | Order does not matter |
| Associativity | (x * h₁) * h₂ = x * (h₁ * h₂) | Cascaded filters |
| Distributivity | x * (h₁ + h₂) = x*h₁ + x*h₂ | Parallel filters |
| Identity element | x * δ = x | δ[n] is the 'unit' of convolution |
| Shift | x * δ[n-k] = x[n-k] | Delay by k samples |
**Convolution in time is multiplication in the frequency domain!** Y(ω) = X(ω) · H(ω). This is the basis of FFT-based filtering: instead of O(N²) direct convolution, we do an O(N log N) FFT of each zero-padded sequence, an element-wise multiplication, and an inverse FFT.
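A minimal sketch of FFT-based convolution follows. It assumes a textbook recursive radix-2 FFT written out here only for illustration; production code would call a library FFT instead. Both sequences are zero-padded to a power of two at least as long as the full output, so the circular convolution that the FFT computes matches the linear one.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def fft_convolve(x, h):
    """Linear convolution via zero-padded FFT, multiply, inverse FFT."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:                # next power of two
        size *= 2
    X = fft(list(x) + [0] * (size - len(x)))
    H = fft(list(h) + [0] * (size - len(h)))
    Y = [p * q for p, q in zip(X, H)]    # Y = X * H, bin by bin
    y = fft(Y, invert=True)
    return [v.real / size for v in y[:out_len]]   # normalize the inverse

x = [2, 0, -1, 3]                        # hypothetical input
h = [1, 2, 1]                            # hypothetical impulse response
y_fft = fft_convolve(x, h)               # close to [2, 4, 1, 1, 5, 3]
```

The result agrees with direct convolution up to floating-point rounding. For sequences this short the direct method is faster; the FFT route pays off as N grows.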

Back to LTI: convolution is the mechanism that turns knowledge of h[n] into the response to any arbitrary input. Record the impulse response of Carnegie Hall - and that acoustic can be imposed on any recording. That is exactly what convolution reverbs do in professional audio: h[n] is a snapshot of a space.

Convolution in DSP history

Convolution has been used in mathematics since the 18th century (Euler, Laplace), but became a practical tool with the advent of digital computers. The Cooley-Tukey FFT algorithm (1965) cut the cost of long convolutions from O(N²) to O(N log N), making real-time audio and image processing feasible.

Myth: convolution and correlation are the same thing

Convolution flips one of the functions (h[n-k]), while correlation does not (h[k-n]). For symmetric h (h[-n] = h[n]) they coincide; for asymmetric h they differ.

Convolution: y[n] = Σ x[k]·h[n-k] - h is flipped and shifted. Correlation: R[n] = Σ x[k]·h[k-n] - h is only shifted. The flip in convolution means each input x[k] is weighted by how long ago it arrived (n-k samples), so a causal system gives a physically correct response: past inputs affect the current output. Correlation measures similarity; convolution computes the response.
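The difference shows up as soon as h is asymmetric. The sketch below uses the lag convention R[n] = Σ x[k]·h[k-n], which is an assumption of this example (other texts shift in the opposite direction). It compares the output lists position by position and ignores the fact that the correlation output starts at a negative lag. All sample values are made up.

```python
def convolve(x, h):
    """y[n] = sum_k x[k] * h[n - k]: h is flipped, then shifted."""
    y = [0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += x[k] * h[n - k]
    return y

def correlate(x, h):
    """R[lag] = sum_k x[k] * h[k - lag]: h is shifted but not flipped.

    Entry i of the result corresponds to lag = i - (len(h) - 1).
    """
    r = [0] * (len(x) + len(h) - 1)
    for i in range(len(r)):
        lag = i - (len(h) - 1)
        for k in range(len(x)):
            if 0 <= k - lag < len(h):
                r[i] += x[k] * h[k - lag]
    return r

x = [1, 2, 3, 4]
h_sym = [1, 2, 1]        # reads the same forwards and backwards
h_asym = [1, 2, 3]

same_for_symmetric = correlate(x, h_sym) == convolve(x, h_sym)       # True
same_for_asymmetric = correlate(x, h_asym) == convolve(x, h_asym)    # False
```

For the asymmetric case, convolution gives [1, 4, 10, 16, 17, 12] while correlation gives [3, 8, 14, 20, 11, 4], which is what convolving with the reversed h would produce.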

x[n] = {1, 2, 3}, h[n] = {1, 1}. What is y[0] = (x * h)[0]?

Key Ideas

  • **x[n]** - discrete signal = sequence of numbers. Any signal decomposes into shifted impulses: x[n] = Σ x[k]·δ[n-k]
  • **LTI system:** linear + time-invariant. Completely described by the impulse response h[n] - the reaction to a unit impulse
  • **Convolution** y[n] = Σ x[k]·h[n-k] - output of an LTI system for any input. Know the response to a click - know the response to a symphony. O(N²) directly, O(N log N) via FFT
  • **Convolution ≠ correlation:** convolution flips h (system response), correlation does not (similarity). They coincide for symmetric h
  • **Nyquist 1928, Shannon 1949 → Shazam, 5G, Whisper:** one theorem about sampling rate underlies all of digital signal processing

Related Topics

Signals and systems are the foundation for all of DSP:

  • Sampling and the Nyquist Theorem — Why CD = 44100 Hz, not 22050. Aliasing - what happens when the theorem is violated
  • Z-Transform — Convolution in the time domain = multiplication in the Z-domain. Filters via poles and zeros

Questions for Reflection

  • The system y[n] = median(x[n-1], x[n], x[n+1]) is a median filter. Is it LTI? Why is the median nonlinear while the mean is not?
  • Convolution is commutative: x * h = h * x. But physically the signal passes through the system, not the other way around. How does that reconcile with the mathematical equality?
  • Shazam records 5 seconds of audio and builds a spectrogram. Which operations from this lesson are involved? Where exactly does convolution appear?

Related Lessons

  • calc-01-sequences — Calculus is essential for DSP: derivatives (signal rate of change), integrals (energy), Fourier series
  • cg-01 — Convolution in DSP (signal filtering) is mathematically identical to convolution in image processing (blur, sharpen)
  • it-01 — Information theory and DSP are linked through the Nyquist-Shannon sampling theorem
  • arvr-01 — DSP is used in AR/VR for spatial audio (binaural), IMU processing, and noise cancellation
  • trig-11