Linear prediction of audio signals
Toon van Waterschoot and Marc Moonen
Katholieke Universiteit Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium
toon.vanwaterschoot@esat.kuleuven.be
http://homes.esat.kuleuven.be/~tvanwate/
Linear prediction audio applications:
• audio compression (warped LP)
• lossless audio coding
• audio (spectral) analysis
• audio signal whitening (feedback/echo cancellation)
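Tonal audio signal model, t = 1, ..., M:
\[ x(t) = \sum_{n=1}^{N} A(n) \sin[n\omega_0 t + \phi(n)] + e(t) \]
Residual spectral flatness measure:
\[ \mathrm{SFM}_R = \frac{\exp\left\{ \frac{1}{M} \sum_{f=0}^{\tilde{f}_s} \ln |R(f)|^2 \right\}}{\frac{1}{M} \sum_{f=0}^{\tilde{f}_s} |R(f)|^2} \]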
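The flatness measure above is the ratio of the geometric mean to the arithmetic mean of the residual power spectrum. A minimal sketch, assuming the sums run over the one-sided DFT bins of the residual, that both means are normalized by the number of bins, and that the dB value is 10 log10 of the ratio (the function name `spectral_flatness_db` is hypothetical, not from the poster):

```python
import numpy as np

def spectral_flatness_db(r):
    """Spectral flatness of a residual r(t) in dB (0 dB means a flat spectrum)."""
    # |R(f)|^2 on the DFT bins from 0 up to the Nyquist frequency (assumption).
    power = np.abs(np.fft.rfft(r)) ** 2
    # Guard against log(0) for bins that are exactly zero.
    power = np.maximum(power, np.finfo(float).tiny)
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    # 10*log10 of the ratio is an assumed dB convention.
    return 10.0 * np.log10(geometric_mean / arithmetic_mean)
```

A unit impulse has a flat spectrum and therefore scores 0 dB, while a pure sinusoid concentrates its power in one bin and scores strongly negative.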
[Figure: magnitude spectrum of the tonal test signal, 20 log10 |X(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz.]
• fs = 44.1 kHz
• ω0 = 2π/64 rad (f0 = 689.1 Hz)
• M = 2048 samples = 46.4 ms
• N = 15
• e(t) ≡ 0
• A(n) ∼ U(0, 1)
• φ(n) ∼ U(0, 2π)
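The test signal described by these parameters can be generated as in the sketch below. The function name `tonal_signal` and the fixed random seed are illustrative assumptions; the poster does not state how the random amplitudes and phases were drawn beyond their distributions.

```python
import numpy as np

def tonal_signal(M=2048, N=15, omega0=2 * np.pi / 64, seed=0):
    """Sum of N harmonics of omega0 with random amplitudes and phases, e(t) = 0."""
    rng = np.random.default_rng(seed)                # seed is an assumption
    t = np.arange(1, M + 1)                          # t = 1, ..., M
    n = np.arange(1, N + 1)[:, None]                 # harmonic index as a column
    A = rng.uniform(0.0, 1.0, size=(N, 1))           # A(n) ~ U(0, 1)
    phi = rng.uniform(0.0, 2 * np.pi, size=(N, 1))   # phi(n) ~ U(0, 2*pi)
    return np.sum(A * np.sin(n * omega0 * t + phi), axis=0)

fs = 44100.0
f0 = fs / 64          # fundamental frequency in Hz for omega0 = 2*pi/64
x = tonal_signal()
```

Because ω0 = 2π/64, the generated signal is exactly periodic with a period of 64 samples.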
LP method               SFM_R (dB)
LP (autocorrelation)      -6.92
LP (covariance)          -52.97
Pole-zero model           -1.19
High-order LP             -0.55
Pitch prediction         -15.37
Warped LP                 -5.17
Selective LP              -1.76
Conventional linear prediction methods
Autocorrelation method
[Figure: magnitude response 20 log10 |H(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 30.]
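For reference, the autocorrelation method can be sketched as follows: the frame is treated as zero outside its M samples, a biased autocorrelation estimate is formed, and the Toeplitz normal equations are solved for the predictor. The rectangular window, the sign convention r(t) = x(t) + Σ a_k x(t−k), and the function names are assumptions of this sketch, not details given on the poster.

```python
import numpy as np

def lp_autocorrelation(x, order):
    """Return a = [1, a_1, ..., a_P] from the autocorrelation method."""
    x = np.asarray(x, dtype=float)
    M = len(x)
    # Biased autocorrelation estimate r(k) for k = 0, ..., P.
    r = np.array([np.dot(x[: M - k], x[k:]) for k in range(order + 1)]) / M
    # Toeplitz normal equations: sum_k a_k r(|i - k|) = -r(i), i = 1, ..., P.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def lp_residual(x, a):
    """Prediction residual r(t) = x(t) + sum_k a_k x(t - k), zero initial state."""
    return np.convolve(x, a)[: len(x)]
```

A known property of this method is that the resulting prediction error filter has all its roots inside the unit circle, so the poles of the all-pole model cannot sit exactly on the sinusoids.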
Covariance method
[Figure: magnitude response 20 log10 |H(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 30.]
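The covariance method differs in that it minimizes the prediction error only over samples whose full history lies inside the frame, with no windowing or zero padding. A minimal least-squares sketch, using the same assumed sign convention r(t) = x(t) + Σ a_k x(t−k) and a hypothetical function name:

```python
import numpy as np

def lp_covariance(x, order):
    """Return a = [1, a_1, ..., a_P] from the covariance (least-squares) method."""
    x = np.asarray(x, dtype=float)
    M = len(x)
    # Each row holds [x(t-1), ..., x(t-P)] for t = P, ..., M-1 (0-based indices).
    past = np.column_stack([x[order - k : M - k] for k in range(1, order + 1)])
    target = x[order:]
    # Least-squares solution of past @ a = -target.
    a, *_ = np.linalg.lstsq(past, -target, rcond=None)
    return np.concatenate(([1.0], a))

# Usage: a noiseless sinusoid is predicted exactly by an order-2 model,
# and the roots of the predictor lie on the unit circle at +/- omega.
omega = 0.3
a_demo = lp_covariance(np.sin(omega * np.arange(200) + 0.7), 2)
```

For a noiseless sinusoid the solution is a = [1, −2cos ω, 1], which places the model poles exactly at the tone frequency.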
Autocorrelation method ⇒ residual spectral flatness maximization
Covariance method ⇒ high-resolution tonal components identification
Conjectures:
1. An AR(2N) model is not suited to achieving BOTH residual spectral flatness and high-resolution identification of the tonal components, UNLESS the tonal components are uniformly distributed over the Nyquist interval (ω0 = π/(N + 1)).
2. An AR(2N) model estimated with the covariance method fails to identify the tonal components when noise is added to the tonal signal (e(t) ≠ 0).
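As a worked check of the uniform-distribution condition in conjecture 1, the arithmetic below uses the poster's N = 15 and fs = 44.1 kHz. The comparison with the test signal is our own calculation, not a result stated on the poster.

```latex
\[
\omega_0 = \frac{\pi}{N+1} = \frac{\pi}{16}
\quad\Rightarrow\quad
f_0 = \frac{f_s}{2(N+1)} = \frac{44100}{32} = 1378.125~\text{Hz},
\qquad
n f_0 \in \{1378.125,\ 2756.25,\ \ldots,\ 20671.875\}~\text{Hz}.
\]
\[
\text{Test signal: } \omega_0 = \frac{2\pi}{64}
\quad\Rightarrow\quad
f_0 = \frac{44100}{64} = 689.0625~\text{Hz},
\qquad
N f_0 = 15 \times 689.0625 = 10335.9375~\text{Hz} < \frac{f_s}{4} = 11025~\text{Hz}.
\]
```

In the uniform case the 15 harmonics are spaced by f0 and the gaps to 0 Hz and to the Nyquist frequency (22050 Hz) are also f0. The test signal's harmonics, by contrast, all fall in the lower half of the Nyquist interval.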
Alternative linear prediction methods
Pole-zero modelling
[Figure: magnitude response 20 log10 |H(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part).]
High-order AR modelling
[Figure: magnitude response 20 log10 |H(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 2 (repeated) and 1024.]
Pitch prediction
[Figure: magnitude response 20 log10 |H(e^{j2πf/fs})| (dB) versus f (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 64.]
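One simple form of pitch prediction is a one-tap long-term predictor, r(t) = x(t) − β x(t − T). The sketch below assumes that form and a lag T equal to the signal period; for the test signal, T = 64 samples follows from ω0 = 2π/64. The poster does not state how many taps its pitch predictor uses or whether it is cascaded with a short-term predictor, so this is an illustration of the idea only, and the function name is hypothetical.

```python
import numpy as np

def pitch_prediction_residual(x, lag):
    """One-tap long-term prediction: returns (beta, residual) for a given lag."""
    x = np.asarray(x, dtype=float)
    past = x[:-lag]        # x(t - T)
    present = x[lag:]      # x(t)
    # Least-squares gain that minimizes sum (x(t) - beta * x(t - T))^2.
    beta = np.dot(present, past) / np.dot(past, past)
    return beta, present - beta * past

# Usage on a signal that is exactly periodic with period 64 samples.
t_demo = np.arange(2048)
x_demo = np.sin(2 * np.pi * t_demo / 64) + 0.5 * np.sin(2 * np.pi * 3 * t_demo / 64 + 1.0)
beta_demo, res_demo = pitch_prediction_residual(x_demo, 64)
```

For an exactly periodic, noiseless input the least-squares gain is 1 and the residual vanishes up to rounding error.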
Warped linear prediction
[Figure: warped signal spectrum 20 log10 |X(e^{j2πf̃/fs})| (dB) and magnitude response 20 log10 |H(e^{j2πf̃/fs})| (dB), both versus warped frequency f̃ (Hz), 0 to 2 × 10^4 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 30.]
Selective linear prediction
[Figure: signal spectrum 20 log10 |X(e^{j2πKf/fs})| (dB) and magnitude response 20 log10 |H(e^{j2πKf/fs})| (dB), both versus f (Hz), 0 to 10000 Hz; pole-zero plot (Real Part versus Imaginary Part), annotated with 30.]
See also: T. van Waterschoot and M. Moonen, “Comparison of linear prediction models for audio signals,” EURASIP J. Audio, Speech, Music Process., submitted for publication. ftp://ftp.esat.kuleuven.be/pub/sista/vanwaterschoot/abstracts/07-29.html