• No results found

by incorporating prior knowledge

N/A
N/A
Protected

Academic year: 2021

Share "by incorporating prior knowledge"

Copied!
1
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

(1)

Towards optimal regularization

by incorporating prior knowledge

Toon van Waterschoot and Marc Moonen

Katholieke Universiteit Leuven, ESAT-SCD, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium toon.vanwaterschoot@esat.kuleuven.ac.be

1 Regularization in least squares parameter estimation

Consider the linear estimation problem

y (1) y (2)

...

y (N )

=

u (1) u (0) . . . u (1 − n F ) u (2) u (1) . . . u (2 − n F )

... ... . .. ...

u(N ) u(N − 1) . . . u(N − n F )

·

 f 0 f 1 ...

f n

F

 +

v (1) v (2)

...

v(N )

 ,

or in matrix notation

y = Uf + v. (1)

An estimator for f may be obtained by minimizing the least squares (LS) criterion

min ˆ f

V LS (ˆ f ) = min

ˆ f

(y − Uˆ f ) T (y − Uˆ f ) (2) which results in the well-known LS estimator





ˆ f LS = (U T U ) −1 U T y. (3) When the input signal u(t) is not (or hardly) persistently ex- citing, the matrix U T U may be ill-conditioned or even singu- lar. A common solution is to apply Tikhonov regularization by adding a scaled identity matrix to U T U :









ˆ f RgLS = (U T U + αI) −1 U T y. (4) This is in fact the solution to a ridge regression problem as for- mulated by the criterion

min ˆ f

V RR (ˆ f ) = min

ˆ f

(y − Uˆ f ) T (y − Uˆ f ) + αkˆ f k 2 2 . (5) We address the question whether it is convenient to replace the scaled unity matrix αI by a non-identity, possibly non- diagonal regularization matrix P:









ˆ f RgLS = (U T U + P) −1 U T y, (6) and if it is, what is the optimal choice for P?

2 Linear mimimum mean square error estimation

Let us derive an expression for the minimum mean square error (MMSE) estimator of f :

min ˆ f

V M M SE (ˆ f ) = min

ˆ f

E (ˆ f − f ) T (ˆ f − f ) (7) under the following assumptions:

• The estimator is a linear function of the data in y:

ˆ f MMSE = Z T y. (8)

• The measurement noise in v is drawn from a stationary white noise process with zero mean and variance σ v 2 :

µ v , Ev = 0, (9)

R v , cov(v) = Evv T = σ v 2 I. (10)

• The true parameter vector f is considered as a random vari- able on which some prior knowledge may be available.

More specifically, let the prior probability density function (PDF) p(f ) be characterized by its first and second order mo- ments:

µ f , Ef, (11)

R f , cov(f) = E(f − Ef)(f − Ef) T . (12) Then the linear MMSE estimator can be obtained as the mean of the posterior PDF p(f |y) after the data have been recorded:

ˆ f MMSE = E(f |y) = µ+(U T R v −1 U +R f −1 ) −1 U T R v −1 (y−Uµ).

We will construct the prior knowledge on f in such a way that µ f = 0. Then, also using the white noise assumption in (10), the expression for ˆf MMSE can be rewritten as





ˆ f MMSE = (U T U + σ v 2 R f −1 ) −1 U T y. (13) From this point of view, applying Tikhonov regularization as in (4) is equivalent to assuming that the true parameter vector f is drawn from a stationary white noise process. In many ap- plications however, more information on the true parameter will be available and an appropriate non-identity covariance matrix R f may be constructed.

3 An example application:

Acoustic echo cancellation

Consider an acoustic echo cancellation (AEC) application where:

• the loudspeaker signal u(t) is an audio signal (typically speech),

• the near-end noise signal v(t) is assumed to be drawn from a stationary white noise process,

• the acoustic impulse response (AIR) F (q) = f 0 + f 1 q −1 + . . . + f n

F

q −n

F

, with q the time shift operator, is assumed to be stationary, and

• the microphone signal is made up of the echo signal x(t) = F (q)u(t) and the near-end noise signal: y(t) = x(t) + v(t).

Typically the loudspeaker signal exhibits a tonal spectrum and the AIR is of high order. Therefore the lack of persistent excitation may indeed be a problem in the AEC application.

If an LS estimator with Tikhonov regularization is applied in the AEC problem, the regularization matrix P = αI (with the regularization parameter typically chosen as α = σ v 2 ) looks like this:

0

20

40

60

80

1000

20

40

60

80

100 0

0.5 1 1.5

An acoustic impulse response has a very typical form, which may be characterized by three parameters:

• the initial delay, which corresponds to the time needed by the loudspeaker sound wave to reach the microphone through a direct path (i.e. without reflections),

• the direct path attenuation, which determines the peak re- sponse in the AIR, and

• the exponential decay rate, which models the reverberant tail of the AIR.

0 0.02 0.04 0.06 0.08 0.1 0.12

−0.8

−0.6

−0.4

−0.2 0 0.2 0.4 0.6 0.8 1

t (s)

direct path attenuation

initial delay

exponential decay rate

These three parameters may be calculated from the acoustic setup (distance between loudspeaker and microphone, acous- tic absorption of the walls, room volume, etc.). Hence they can be considered as prior knowledge. If these three param- eters are taken into account, a diagonal regularization matrix may be constructed that looks like this:

0 20

40 60

80

100 0 10 20 30 40 50 60 70 80 90 100

0 20 40 60 80 100 120 140

An idealized case, which is interesting as a reference method, occurs when the true AIR is known. In this case a diagonal reg- ularization matrix may be constructed with the diagonal ele- ments equal to the inverse square values of the true parameter vector coefficients:

0

20

40

60 80

100 0

20

40

60

80

100 0

2000 4000 6000 8000 10000

4 Simulation results

Four different least squares estimators were compared for the AEC application: the LS estimator ˆf LS and the regularized LS estimator ˆf RgLS with Tikhonov regularization (P = σ v 2 I ), regu- larization based on the three AIR parameters described above (P = σ v 2 R −1 f ,synth ), and on the true AIR (P = σ v 2 R −1 f ,true ).

For each estimator, 100 simulation runs were performed with a different near-end noise realization.

Two types of loudspeaker signals were used: a sum of n F − 1 sinusoids with random frequencies, uniformly distributed in the interval between DC and the Nyquist frequency, and a male speech signal.

Two types of acoustic impulse responses were used: a hear- ing aid impulse response with n F + 1 = 100 coefficients, and a room impulse response with n F + 1 = 1000 coefficients.

0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0.22 0.24

ˆfRgLS

f−fk2 kfk2

ˆfRgLS ˆfRgLS

P = σv2I

ˆfLS

P = σv2R−1f ,synth P = σv2R−1f ,true

F (q) = hearing aid IR u(t) = Σ sin

0.05 0.1 0.15 0.2 0.25 0.3 0.35

f−fk2 kfk2

ˆfLS ˆfRgLS

P = σv2I

ˆfRgLS ˆfRgLS

P = σv2R−1f ,true P = σv2R−1f ,synth

u (t) = speech F (q) = hearing aid IR

0.1 0.12 0.14 0.16 0.18 0.2 0.22

f−fk2 kfk2

ˆfLS ˆfRgLS

P = σv2I

ˆfRgLS

P = σv2R−1f ,synth

ˆfRgLS

P = σv2R−1f ,true

u(t) = Σ sin F (q) = room IR

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2

f−fk2 kfk2

ˆfLS ˆfRgLS

P = σv2I

ˆfRgLS

P = σv2R−1f ,synth

ˆfRgLS

P = σv2R−1f ,true

u(t) = speech

F (q) = room IR

Referenties

GERELATEERDE DOCUMENTEN

Daarnaast heeft deze nieuwe bronstatistiek een betere dekking verkregen door gebruikmaking van ICP (Intra Communautaire Prestaties)-gegevens van de Belastingdienst voor de

Each failure saw the rampant revival of independent recruiting, and even attempts by some companies to expetiment on company propeny with alter- native systems such

For termination on (final state and) empty stack we have shown that we can find a linear specification of the finite control of the pushdown automaton, put it in parallel with

This makes the method suitable for the solution of large-scale Tikhonov minimization problems (1.5) with fairly general linear regularization operators L.. Section 3

De opgaven voor de tweede ronde van de Vlaamse Olympiade zijn in Euclides nooit gepubliceerd, omdat ze niet van specifiek Vlaams maar van Amerikaans origine zijn. Hier volgen een

Het verkeersproces is op te vatten als een dynamisch systeem. Op macro- scopische schaal verandert de toestand waarin het verkeerssysteem zich bevindt

Reneval-modellen:uit activiteitenprogramma en verlooppercentages voIgt de behoefte aan personeel c.q. de mogelijke promotiepercenta- ges. Overigens kunnen met beide

Part II Distributed signal processing algorithms for heterogeneous multi-task WSNs 111 5 Multi-task WSN for signal enhancement, MVDR beamforming and DOA estimation: single source