Embedded optimization algorithms for multi-microphone dereverberation

Toon van Waterschoot, Bruno Defraene, Moritz Diehl, and Marc Moonen

STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics
ESAT – Department of Electrical Engineering
KU Leuven, Belgium

Outline

1. Introduction
2. Problem statement
3. Embedded optimization algorithms
4. Evaluation


1 – Multi-microphone dereverberation

Overview of dereverberation methods

- beamforming approach
- speech enhancement approach
- blind system identification and inversion approach

Blind system identification and inversion

- two-step procedure:
  1. blind identification of the room impulse responses (RIRs)
  2. inverse filter design using the identified RIRs
- the inversion step is often ill-posed (high sensitivity to RIR estimation errors, near-common RIR zeros, ...)

1 – Embedded optimization

Multi-microphone dereverberation by embedded optimization

- direct estimation of the source signal of interest (jointly with auxiliary variables such as the RIRs)
- the "intermediate" inverse filter design step is skipped, thus avoiding ill-posedness
- more flexibility than "traditional" recursive implementations of closed-form estimators


2 – Problem statement

Multi-microphone sound acquisition

Assumptions

- batch operation (t = 1, ..., N), zero initial source signal conditions
- time-invariant RIRs, with fixed and known length L ≤ N
- no measurement noise

Problem (Multi-microphone dereverberation)

Given a length-MN vector y of microphone signals,

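2 – Data model

True system

$$
y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix}
  = \begin{bmatrix} H_{1,0} \\ \vdots \\ H_{M,0} \end{bmatrix} s_0
  = H_0 s_0 \tag{1}
$$

Parameter vectors to be estimated

- RIRs: $h = [\, h_1^T \; \dots \; h_M^T \,]^T$
- source signal vector: $s = [\, s(1) \; \dots \; s(N) \,]^T$
- estimation error signal vector: $e = [\, e_1^T \; \dots \; e_M^T \,]^T$

Data model (two equivalent formulations)

\begin{align}
y &= Hs + e \tag{2} \\
  &= (I_M \otimes S)\, h + e \tag{3}
\end{align}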
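The equivalence of the two formulations of the data model can be checked numerically. The following is a minimal sketch, assuming that each per-microphone block of H is an N x N lower-triangular convolution (Toeplitz) matrix built from one RIR, and that S is the N x L convolution matrix built from the source signal; these dimensions are inferred from the stated vector lengths (y of length MN, s of length N, RIRs of length L). The helper `conv_matrix` and the small sizes are illustrative choices, not part of the slides.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz matrix T with T[i, j] = x[i - j], zero elsewhere.

    Requires n_cols <= n_rows. Multiplying by T is a convolution with x,
    truncated to n_rows samples (zero initial conditions).
    """
    T = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        stop = min(n_rows, j + len(x))
        T[j:stop, j] = x[:stop - j]
    return T

rng = np.random.default_rng(0)
M, N, L = 3, 16, 4                       # small illustrative sizes
h_list = [rng.standard_normal(L) for _ in range(M)]
s = rng.standard_normal(N)

# Formulation y = H s: one N x N convolution matrix per microphone, stacked.
H = np.vstack([conv_matrix(h_m, N, N) for h_m in h_list])   # (M*N) x N
y_from_H = H @ s

# Formulation y = (I_M kron S) h: S is the N x L convolution matrix of s.
S = conv_matrix(s, N, L)
h = np.concatenate(h_list)                                    # length M*L
y_from_S = np.kron(np.eye(M), S) @ h

print(np.allclose(y_from_H, y_from_S))
```

The first form is linear in s for fixed h, and the second is linear in h for fixed s, which is what makes an alternating (block coordinate) scheme natural.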


3 – Embedded optimization algorithms

Rationale

- joint estimation of the unknown parameter vectors h and s
- incorporate prior knowledge through regularization
- adopt a sequential minimization approach (block coordinate descent, BCD) to deal with nonlinearity
- solve linearized subproblems using (convex) numerical optimization tools

Proposed problem formulations

- NLS: nonlinear least squares
- ℓ2-RNLS: ℓ2-regularized NLS, exploiting prior knowledge on h

3 – NLS problem

Problem (NLS)

\begin{align}
\min_{h,s,e} \quad & \|e\|_2^2 \tag{4} \\
\text{s.t.} \quad & y = Hs + e \tag{5} \\
& \phantom{y} = (I_M \otimes S)\, h + e. \tag{6}
\end{align}

BCD solution strategy

- minimize (4),(5) w.r.t. {s, e} for a fixed value of $h = \hat{h}$
- minimize (4),(6) w.r.t. {h, e} for a fixed value of $s = \hat{s}$
- repeat this procedure for a number of iterations

Properties

- multiple local solutions to the NLS problem exist
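The BCD strategy for the NLS problem can be sketched as two alternating linear least-squares solves. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bcd_nls`, the helper `conv_matrix`, the use of `numpy.linalg.lstsq` for each subproblem, and the small random test data are all hypothetical choices made here.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz matrix T with T[i, j] = x[i - j] (requires n_cols <= n_rows)."""
    T = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        stop = min(n_rows, j + len(x))
        T[j:stop, j] = x[:stop - j]
    return T

def bcd_nls(y, M, N, L, h_init, k_max=10):
    """Alternate between the s-subproblem and the h-subproblem of the NLS cost."""
    h = h_init.copy()
    costs = []
    for _ in range(k_max):
        # s-step: with h fixed, y = H s + e is linear in s.
        H = np.vstack([conv_matrix(h[m * L:(m + 1) * L], N, N) for m in range(M)])
        s = np.linalg.lstsq(H, y, rcond=None)[0]
        # h-step: with s fixed, y = (I_M kron S) h + e is linear in h.
        A = np.kron(np.eye(M), conv_matrix(s, N, L))
        h = np.linalg.lstsq(A, y, rcond=None)[0]
        # Squared residual norm ||e||_2^2 after both steps.
        costs.append(float(np.sum((y - A @ h) ** 2)))
    return h, s, costs

# Noise-free synthetic data (illustrative sizes, random initialization of h).
rng = np.random.default_rng(1)
M, N, L = 3, 32, 4
h_true = rng.standard_normal(M * L)
s_true = rng.standard_normal(N)
y = np.kron(np.eye(M), conv_matrix(s_true, N, L)) @ h_true

h_hat, s_hat, costs = bcd_nls(y, M, N, L, rng.standard_normal(M * L), k_max=10)
print(costs[0], costs[-1])
```

Each step minimizes the same cost exactly over one block, so the recorded cost cannot increase from one iteration to the next. Because the model is bilinear, any pair (c h, s / c) with c ≠ 0 fits the data equally well, so estimates from such a scheme are at best determined up to a scale factor.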

3 – ℓ2-regularized NLS problem

Problem (ℓ2-RNLS)

\begin{align}
\min_{h,s,e} \quad & \|e\|_2^2 + \lambda_1 \|h\|_W^2 \tag{7} \\
\text{s.t.} \quad & y = Hs + e \tag{8} \\
& \phantom{y} = (I_M \otimes S)\, h + e \tag{9}
\end{align}

Properties

- ℓ2-norm regularization results in a smoothing effect, facilitating convergence to a meaningful local solution
- the regularization matrix W allows prior knowledge on h to be incorporated, e.g., using Polack's model for late reverberation (with fixed decay α)

$$
W = I_M \otimes \operatorname{diag}\{ \dots \}
$$
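With s fixed, the ℓ2-regularized subproblem in h is a weighted ridge regression with a closed-form solution. The sketch below assumes the usual convention that the weighted norm squared equals hᵀWh, so the minimizer solves (AᵀA + λ1 W) h = Aᵀy with A = I_M ⊗ S. The diagonal weights exp(2αn) are an assumption made here for illustration (the inverse of an exponentially decaying energy envelope); the slides do not give the diagonal entries of W. The names `polack_weights` and `regularized_h_step` are hypothetical.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz matrix T with T[i, j] = x[i - j] (requires n_cols <= n_rows)."""
    T = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        stop = min(n_rows, j + len(x))
        T[j:stop, j] = x[:stop - j]
    return T

def polack_weights(L, alpha):
    # ASSUMPTION: weights grow as exp(2*alpha*n), penalizing late taps more.
    return np.exp(2.0 * alpha * np.arange(L))

def regularized_h_step(S, y, M, lam1, w):
    """Minimize ||y - (I_M kron S) h||^2 + lam1 * h^T W h, W = I_M kron diag(w)."""
    A = np.kron(np.eye(M), S)
    W = np.kron(np.eye(M), np.diag(w))
    # Normal equations of the regularized least-squares problem.
    return np.linalg.solve(A.T @ A + lam1 * W, A.T @ y)

rng = np.random.default_rng(2)
M, N, L, alpha, lam1 = 3, 32, 8, 0.025, 0.1
S = conv_matrix(rng.standard_normal(N), N, L)
y = rng.standard_normal(M * N)          # arbitrary data, for illustration only
w = polack_weights(L, alpha)

h_reg = regularized_h_step(S, y, M, lam1, w)
h_ls = np.linalg.lstsq(np.kron(np.eye(M), S), y, rcond=None)[0]
print(np.sum(np.tile(w, M) * h_reg**2), np.sum(np.tile(w, M) * h_ls**2))
```

The regularized estimate always has a weighted norm no larger than the unregularized least-squares estimate, which is the shrinkage toward the prior envelope that the penalty is meant to provide.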

3 – ℓ1/ℓ2-regularized NLS problem

Problem (ℓ1/ℓ2-RNLS)

\begin{align}
\min_{h,s,e} \quad & \|e\|_2^2 + \lambda_1 \|h\|_W^2 + \lambda_2 \|Ds\|_1 \tag{11} \\
\text{s.t.} \quad & y = Hs + e \tag{12} \\
& \phantom{y} = (I_M \otimes S)\, h + e. \tag{13}
\end{align}

Properties

- ℓ1-norm regularization results in an additional smoothing effect
- $\|Ds\|_1$ promotes source signal sparsity in the basis defined by D
- sparsity in a spectral basis (DFT, DCT, ...) is meaningful for speech/audio
- minimization w.r.t. {s, e} does not yield a closed-form solution, but allows an efficient numerical solution using convex optimization tools (e.g., interior point method, method of multipliers, ...)
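To make the ℓ1-regularized s-subproblem concrete, the sketch below solves it with a proximal gradient iteration (ISTA). This is a swapped-in technique chosen for brevity, not one of the solvers named on the slide (interior point, method of multipliers). It assumes that D is an orthonormal DCT-II matrix, so that the substitution c = Ds, s = Dᵀc turns the problem into a standard ℓ1-regularized least-squares problem in c. All function names and the test data are hypothetical.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz matrix T with T[i, j] = x[i - j] (requires n_cols <= n_rows)."""
    T = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        stop = min(n_rows, j + len(x))
        T[j:stop, j] = x[:stop - j]
    return T

def dct_matrix(N):
    """Orthonormal DCT-II matrix D, so that D @ D.T is the identity."""
    n = np.arange(N)
    D = np.sqrt(2.0 / N) * np.cos(np.pi * (n + 0.5) * n.reshape(-1, 1) / N)
    D[0, :] = np.sqrt(1.0 / N)
    return D

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_s_step(H, y, D, lam2, n_iter=200):
    """Minimize ||y - H s||^2 + lam2 * ||D s||_1 over s, with h (hence H) fixed."""
    B = H @ D.T                                      # model in DCT coefficients c
    step = 1.0 / (2.0 * np.linalg.norm(B, 2) ** 2)   # 1 / Lipschitz constant
    c = np.zeros(B.shape[1])
    objective = []
    for _ in range(n_iter):
        grad = 2.0 * B.T @ (B @ c - y)
        c = soft_threshold(c - step * grad, step * lam2)
        objective.append(float(np.sum((y - B @ c) ** 2) + lam2 * np.sum(np.abs(c))))
    return D.T @ c, objective

rng = np.random.default_rng(3)
M, N, L, lam2 = 3, 32, 4, 0.1
H = np.vstack([conv_matrix(rng.standard_normal(L), N, N) for _ in range(M)])
D = dct_matrix(N)
c_true = np.zeros(N)
c_true[[2, 5, 11]] = [1.0, -0.7, 0.4]                # source sparse in the DCT basis
y = H @ (D.T @ c_true)

s_hat, objective = ista_s_step(H, y, D, lam2)
print(objective[0], objective[-1])
```

With the step size set to the reciprocal of the gradient's Lipschitz constant, the objective is non-increasing over the iterations. A dedicated solver would typically converge faster than this plain iteration.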


4 – Simulation setup

Acoustic scenario

- source signal = quasi-stationary voiced speech segment
- source signal length N = 1024 at fs = 8 kHz
- M = 5 microphones
- RIRs = Gaussian white noise (GWN) shaped by Polack's model envelope (α = 0.025)
- RIR length L = 100

Algorithm parameters

- regularization parameters λ1 = λ2 = 0.1
- D = DCT matrix
- random (GWN) initialization of the RIR parameter vector estimate
- fixed number of kmax = 10 iterations
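A comparable synthetic setup can be generated in a few lines. The sketch below uses the parameter values listed above (N = 1024, fs = 8 kHz, M = 5, L = 100, α = 0.025), but two things are assumptions made here: the envelope is taken as exp(-αn) in the sample index n, and the source is a hypothetical harmonic stand-in (a sum of sinusoids at multiples of 120 Hz), since the actual speech segment is not available.

```python
import numpy as np

rng = np.random.default_rng(4)
N, fs = 1024, 8000          # source length and sampling rate
M, L = 5, 100               # number of microphones and RIR length
alpha = 0.025               # decay parameter of the envelope

# RIRs: Gaussian white noise shaped by an exponentially decaying envelope.
# ASSUMPTION: envelope of the form exp(-alpha * n), n = 0, ..., L-1.
envelope = np.exp(-alpha * np.arange(L))
rirs = rng.standard_normal((M, L)) * envelope          # shape (M, L)

# HYPOTHETICAL stand-in for a quasi-stationary voiced speech segment:
# a few harmonics of a 120 Hz fundamental with decreasing amplitude.
t = np.arange(N) / fs
source = sum(np.sin(2 * np.pi * 120.0 * k * t) / k for k in range(1, 6))

# Noise-free microphone signals: convolution truncated to N samples
# (batch operation, zero initial conditions), stacked into a length-M*N vector.
y = np.concatenate([np.convolve(rirs[m], source)[:N] for m in range(M)])

# Random (GWN) initialization of the RIR parameter vector estimate.
h_init = rng.standard_normal(M * L)

print(rirs.shape, y.shape, h_init.shape)
```

The vectors `y` and `h_init` have the shapes expected by a block coordinate descent routine operating on the stacked data model.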

4 – RIR estimation results

Estimated RIR for microphone m = 2

Observations

- the NLS estimate corresponds to an erroneous local solution
- ℓ2-norm regularization improves overall estimation performance (= improved RIR envelope)
- ℓ1-norm regularization improves local estimation performance

4 – Source signal estimation results

Estimated source signal magnitude spectrum

Observations

- the NLS estimate corresponds to an inaccurate local solution
- ℓ2-norm regularization improves overall estimation performance (= improved source signal spectrum envelope)
- ℓ1-norm regularization improves local estimation performance


5 – Conclusion

Multi-microphone dereverberation by embedded optimization

- allows more general problem formulations (≠ closed-form estimators)
- achieves direct source signal estimation (≠ inverse filter design)
- facilitates use of prior knowledge
- yields promising simulation results

Future research challenges

- batch → online (frame-based) processing
- synthetic → realistic RIRs
- zero → non-zero measurement noise
- general-purpose → fast dedicated numerical optimization tools
- include perceptual criteria in problem formulation