
Linear prediction of audio signals

Toon van Waterschoot and Marc Moonen

Katholieke Universiteit Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium

toon.vanwaterschoot@esat.kuleuven.be

http://homes.esat.kuleuven.be/~tvanwate/

Linear prediction audio applications:

• audio compression (warped LP)

• lossless audio coding

• audio (spectral) analysis

• audio signal whitening (feedback/echo cancellation)

Tonal audio signal model, t = 1, ..., M:

\[ x(t) = \sum_{n=1}^{N} A(n) \sin[n\omega_0 t + \phi(n)] + e(t) \]

Residual spectral flatness measure:

\[ \mathrm{SFM}_R = \frac{\exp\left\{ \frac{1}{M} \sum_{f=0}^{f_s} \ln |R(f)|^2 \right\}}{\frac{1}{M} \sum_{f=0}^{f_s} |R(f)|^2} \]
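The spectral flatness measure is the ratio of the geometric mean to the arithmetic mean of the residual power spectrum. A minimal sketch follows, assuming that |R(f)|² is evaluated on the M DFT bins of the residual frame and that the dB value is taken as 10 log10 of the ratio. The function name `spectral_flatness_db` and the small floor that guards the logarithm are illustrative choices, not taken from the poster.

```python
import numpy as np

def spectral_flatness_db(r):
    """Spectral flatness of a residual frame r in dB (0 dB = perfectly flat)."""
    power = np.abs(np.fft.fft(r)) ** 2            # |R(f)|^2 on the M DFT bins
    power = np.maximum(power, 1e-300)             # assumption: floor to avoid log(0)
    geometric_mean = np.exp(np.mean(np.log(power)))
    arithmetic_mean = np.mean(power)
    return 10.0 * np.log10(geometric_mean / arithmetic_mean)
```

A unit impulse has a flat spectrum and gives 0 dB, while a pure sinusoid concentrates its power in two bins and gives a strongly negative value.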

[Figure: magnitude spectrum of the test signal, 20 log10 |X(e^{j2πf/f_s})| (dB) versus f (Hz), for 0 ≤ f ≤ 2 × 10⁴ Hz]

• f_s = 44.1 kHz

• ω₀ = 2π/64 rad (f₀ = 689.1 Hz)

• M = 2048 samples = 46.4 ms

• N = 15

• e(t) ≡ 0

• A(n) ∼ U(0, 1)

• φ(n) ∼ U(0, 2π)
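The test signal defined by the parameters above can be synthesized in a few lines. This is a sketch only: the random seed is arbitrary, so the amplitudes and phases drawn here will not match the realization used for the poster's figures.

```python
import numpy as np

rng = np.random.default_rng(0)          # assumption: arbitrary seed

fs = 44100.0                            # sampling frequency (Hz)
w0 = 2 * np.pi / 64                     # fundamental radial frequency (rad)
M = 2048                                # frame length (samples)
N = 15                                  # number of harmonics

t = np.arange(1, M + 1)                 # t = 1, ..., M
n = np.arange(1, N + 1)                 # harmonic index
A = rng.uniform(0.0, 1.0, N)            # A(n) ~ U(0, 1)
phi = rng.uniform(0.0, 2 * np.pi, N)    # phi(n) ~ U(0, 2*pi)

# x(t) = sum_n A(n) sin[n w0 t + phi(n)], with e(t) = 0
x = np.sum(A[:, None] * np.sin(n[:, None] * w0 * t[None, :] + phi[:, None]), axis=0)

f0 = w0 * fs / (2 * np.pi)              # fundamental frequency in Hz
```

With ω₀ = 2π/64 the signal is exactly periodic with a period of 64 samples, a property that the pitch predictor exploits.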

LP method               SFM_R (dB)
LP (autocorrelation)      -6.92
LP (covariance)          -52.97
Pole-zero model           -1.19
High-order LP             -0.55
Pitch prediction         -15.37
Warped LP                 -5.17
Selective LP              -1.76

Conventional linear prediction methods

Autocorrelation method

[Figure: LP model magnitude response, 20 log10 |H(e^{j2πf/f_s})| (dB) versus f (Hz); pole-zero plot, Real Part versus Imaginary Part, labelled 30]
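As a rough illustration of the autocorrelation method, the sketch below estimates the autocorrelation of the frame (treated as zero outside its M samples) and solves the resulting Toeplitz normal equations. The function names are hypothetical, and a direct solve is used for brevity where a Levinson-Durbin recursion would normally be preferred.

```python
import numpy as np

def lp_autocorrelation(x, order):
    """Autocorrelation-method LP: returns a with x(t) ~ sum_k a[k-1] * x(t-k)."""
    x = np.asarray(x, dtype=float)
    M = len(x)
    # Biased autocorrelation estimate for lags 0..order.
    r = np.array([x[:M - k] @ x[k:] for k in range(order + 1)]) / M
    # Symmetric Toeplitz matrix R[i, j] = r(|i - j|).
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    return np.linalg.solve(r[lags], r[1:order + 1])

def prediction_residual(x, a):
    """Filter x with the prediction-error filter 1 - sum_k a[k-1] z^-k."""
    h = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(x, h)[:len(x)]
```

For the test signal on this poster the order would be 2N = 30, for example `a = lp_autocorrelation(x, 30)`.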

Covariance method

[Figure: LP model magnitude response, 20 log10 |H(e^{j2πf/f_s})| (dB) versus f (Hz); pole-zero plot, Real Part versus Imaginary Part, labelled 30]
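The covariance method minimizes the prediction error only over samples for which the full prediction history lies inside the frame, so no windowing is implied. A minimal least-squares sketch, with the hypothetical name `lp_covariance`:

```python
import numpy as np

def lp_covariance(x, order):
    """Covariance-method LP: least squares over t = order, ..., M-1 only."""
    x = np.asarray(x, dtype=float)
    M = len(x)
    # Column k-1 holds the delayed samples x(t-k) for t = order, ..., M-1.
    X = np.column_stack([x[order - k:M - k] for k in range(1, order + 1)])
    y = x[order:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a
```

For a noiseless sinusoid of radial frequency ω, an order-2 fit returns [2 cos ω, -1], which places the poles on the unit circle at angles ±ω. This is the high-resolution behaviour associated with the covariance method on this poster.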

Autocorrelation method ⇒ residual spectral flatness maximization

Covariance method ⇒ high-resolution tonal components identification

Conjectures:

1. The AR(2N) model is not suited for achieving BOTH residual spectral flatness AND high-resolution identification of the tonal components, UNLESS the tonal components are uniformly distributed over the Nyquist interval (ω₀ = π/(N + 1)).

2. The AR(2N) model with the covariance method fails to identify the tonal components when noise is added to the tonal signal (e(t) ≠ 0).

Alternative linear prediction methods

Pole-zero modelling

[Figure: model magnitude response (dB) versus frequency, 0 to 2 × 10⁴ Hz; pole-zero plot, Real Part versus Imaginary Part]

High-order AR modelling

[Figure: model magnitude response (dB) versus frequency, 0 to 2 × 10⁴ Hz; pole-zero plot, Real Part versus Imaginary Part, with plot labels 2 and 1024]

Pitch prediction

[Figure: model magnitude response (dB) versus frequency, 0 to 2 × 10⁴ Hz; pole-zero plot, Real Part versus Imaginary Part, labelled 64]
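A pitch predictor can be sketched as a one-tap long-term predictor x(t) ≈ β x(t − T). The version below is an assumption about the simplest form of the technique and is not necessarily the predictor behind the poster's results: the lag T is chosen by maximizing the normalized correlation over a caller-supplied lag range, and β is the least-squares gain for that lag.

```python
import numpy as np

def pitch_predictor(x, min_lag, max_lag):
    """One-tap long-term predictor x(t) ~ beta * x(t - T); requires min_lag >= 1."""
    x = np.asarray(x, dtype=float)
    best_lag, best_score = min_lag, -np.inf
    for T in range(min_lag, max_lag + 1):
        cur, past = x[T:], x[:-T]
        denom = np.sqrt((cur @ cur) * (past @ past))
        score = (cur @ past) / denom if denom > 0 else -np.inf
        if score > best_score:           # keep the lag with the highest correlation
            best_lag, best_score = T, score
    cur, past = x[best_lag:], x[:-best_lag]
    beta = (cur @ past) / (past @ past)  # least-squares gain for the chosen lag
    residual = cur - beta * past
    return best_lag, beta, residual
```

For a noiseless signal with ω₀ = 2π/64, the search should return T = 64 with β close to 1, provided the lag range contains 64 but not its multiples.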

Warped linear prediction

[Figure: warped signal spectrum, 20 log10 |X(e^{j2πf̃/f_s})| (dB) versus f̃ (Hz); warped model magnitude response, 20 log10 |H(e^{j2πf̃/f_s})| (dB) versus f̃ (Hz); pole-zero plot, Real Part versus Imaginary Part, labelled 30]
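Warped LP replaces each unit delay of the predictor by a first-order allpass section, which bends the frequency axis. The sketch below computes the warped autocorrelation by repeated allpass filtering of the frame and then solves the same normal equations as in the autocorrelation method. The warping parameter `lam = 0.756` is an assumed value (commonly quoted as approximating the Bark scale at 44.1 kHz) and is not stated on the poster. The function names are hypothetical.

```python
import numpy as np

def allpass(u, lam):
    """First-order allpass D(z) = (z^-1 - lam) / (1 - lam z^-1), zero initial state."""
    y = np.zeros(len(u))
    prev_u = prev_y = 0.0
    for t in range(len(u)):
        y[t] = -lam * u[t] + prev_u + lam * prev_y
        prev_u, prev_y = u[t], y[t]
    return y

def warped_autocorrelation(x, order, lam):
    """r[k] = sum_t x(t) * (D^k x)(t) for k = 0..order, truncated to the frame."""
    x = np.asarray(x, dtype=float)
    r = np.zeros(order + 1)
    xk = x.copy()
    for k in range(order + 1):
        r[k] = x @ xk
        xk = allpass(xk, lam)            # apply one more allpass section
    return r

def warped_lp(x, order, lam=0.756):
    """Warped LP coefficients from the (approximately Toeplitz) normal equations."""
    r = warped_autocorrelation(x, order, lam)
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    return np.linalg.solve(r[lags], r[1:])
```

Setting `lam = 0` turns the allpass section into a plain unit delay, so the routine then reduces to the conventional autocorrelation method.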

Selective linear prediction

[Figure: signal spectrum, 20 log10 |X(e^{j2πKf/f_s})| (dB) versus f (Hz), 0 to 10000 Hz; model magnitude response, 20 log10 |H(e^{j2πKf/f_s})| (dB) versus f (Hz), 0 to 10000 Hz; pole-zero plot, Real Part versus Imaginary Part, labelled 30]
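Selective LP fits the all-pole model to a selected part of the spectrum only. One way to sketch this, assuming the selected band is [0, f_s/(2K)] and that M is even, is to keep the corresponding bins of the power spectrum, treat them as the full half-spectrum of a decimated signal, and obtain the autocorrelation by an inverse DFT. The function name and the choice of K in the usage note are illustrative; the poster does not state the value of K.

```python
import numpy as np

def selective_lp(x, order, K):
    """LP fitted to the band [0, fs/(2K)] of the power spectrum of x."""
    x = np.asarray(x, dtype=float)
    M = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2      # bins covering [0, fs/2]
    n_band = (M // 2) // K + 1               # bins covering [0, fs/(2K)]
    band = power[:n_band]
    # The inverse DFT of the band power spectrum gives the autocorrelation
    # of the band, remapped to the full interval [0, pi].
    r = np.fft.irfft(band)[:order + 1]
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    return np.linalg.solve(r[lags], r[1:])
```

With K = 2, a sinusoid at radial frequency ω in the original signal appears at 2ω in the selected band, so the same model order has fewer tonal components to resolve per unit of (remapped) bandwidth.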

See also: T. van Waterschoot and M. Moonen, “Comparison of linear prediction models for audio signals,” EURASIP J. Audio, Speech, Music Process., submitted for publication. ftp://ftp.esat.kuleuven.be/pub/sista/vanwaterschoot/abstracts/07-29.html