Embedded optimization algorithms for
multi-microphone dereverberation
Toon van Waterschoot, Bruno Defraene,
Moritz Diehl, and Marc Moonen
STADIUS Center for Dynamical Systems,
Signal Processing, and Data Analytics
ESAT – Department of Electrical Engineering
KU Leuven, Belgium
0 – Outline
1 Introduction
2 Problem statement
3 Embedded optimization algorithms
4 Evaluation
1 – Multi-microphone dereverberation
Overview of dereverberation methods
beamforming approach
speech enhancement approach
blind system identification and inversion approach
Blind system identification and inversion
two-step procedure:
  1. blind identification of the room impulse responses (RIRs)
  2. inverse filter design using the identified RIRs
inversion step is often ill-posed (high sensitivity to RIR estimation errors, near-common RIR zeros, ...)
1 – Embedded optimization
Multi-microphone dereverberation by embedded optimization
direct estimation of source signal of interest (jointly with auxiliary variables such as RIRs)
“intermediate” inverse filter design step is skipped, thus avoiding ill-posedness
more flexibility than “traditional” recursive implementations of closed-form estimators
2 – Problem statement
Multi-microphone sound acquisition
Assumptions
batch operation (t = 1, ..., N), zero initial source signal conditions
time-invariant RIRs, with fixed and known length L ≤ N
no measurement noise
Problem (Multi-microphone dereverberation)
Given a length-MN vector y of microphone signals,
2 – Data model
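True system
$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} H_{1,0} \\ \vdots \\ H_{M,0} \end{bmatrix} s_0 = H_0 s_0 \qquad (1)$$
Parameter vectors to be estimated
RIRs: $h = \begin{bmatrix} h_1^T & \dots & h_M^T \end{bmatrix}^T$
source signal vector: $s = \begin{bmatrix} s(1) & \dots & s(N) \end{bmatrix}^T$
estimation error signal vector: $e = \begin{bmatrix} e_1^T & \dots & e_M^T \end{bmatrix}^T$
Data model (two equivalent formulations)
$$y = Hs + e \qquad (2)$$
$$\phantom{y} = (I_M \otimes S)\, h + e \qquad (3)$$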
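The equivalence of the two data-model formulations can be checked numerically. The sketch below assumes that each H_m is the N x N lower-triangular Toeplitz convolution matrix built from h_m and that S is the N x L convolution matrix built from s, which matches the batch operation and zero initial conditions stated earlier. The helper name `conv_matrix` and the small problem sizes are illustrative choices, not taken from the slides.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz convolution matrix T with T[t, l] = x[t - l] (zero where t - l is out of range)."""
    T = np.zeros((n_rows, n_cols))
    for l in range(n_cols):
        k = min(len(x), n_rows - l)
        if k > 0:
            T[l:l + k, l] = x[:k]
    return T

# Illustrative sizes (assumption): smaller than the values used in the evaluation.
M, N, L = 3, 32, 8
rng = np.random.default_rng(0)
s = rng.standard_normal(N)               # source signal vector
h_blocks = rng.standard_normal((M, L))   # one RIR per microphone
h = h_blocks.reshape(-1)                 # stacked vector h = [h_1^T ... h_M^T]^T

# Formulation 1: y = H s, with H the stacked per-microphone convolution matrices (MN x N).
H = np.vstack([conv_matrix(h_m, N, N) for h_m in h_blocks])
y_via_H = H @ s

# Formulation 2: y = (I_M kron S) h, with S the source convolution matrix (N x L).
S = conv_matrix(s, N, L)
y_via_S = np.kron(np.eye(M), S) @ h

print(np.allclose(y_via_H, y_via_S))
```

Both products give the same noiseless microphone signal vector, which is what makes the model linear in s for fixed h and linear in h for fixed s.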
3 – Embedded optimization algorithms
Rationale
joint estimation of unknown parameter vectors h and s
incorporate prior knowledge through regularization
adopt sequential minimization approach (block coordinate descent, BCD) to deal with nonlinearity
solve linearized subproblems using (convex) numerical optimization tools
Proposed problem formulations
NLS: nonlinear least squares
ℓ2-RNLS: ℓ2-regularized NLS, exploiting prior knowledge on h
3 – NLS problem
Problem (NLS)
$$\min_{h,s,e} \; \|e\|_2^2 \qquad (4)$$
$$\text{s.t.} \quad y = Hs + e \qquad (5)$$
$$\phantom{\text{s.t.} \quad y} = (I_M \otimes S)\, h + e \qquad (6)$$
BCD solution strategy
minimize (4), (5) w.r.t. {s, e} for a fixed value of h = ĥ
minimize (4), (6) w.r.t. {h, e} for a fixed value of s = ŝ
repeat this procedure for a number of iterations
Properties
multiple local solutions to NLS problem exist
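The BCD strategy above can be sketched as two alternating linear least-squares solves. This is a minimal illustration under assumptions, not the authors' implementation: the helper names `conv_matrix` and `bcd_nls`, the use of `numpy.linalg.lstsq` for each subproblem, and the small problem sizes are all choices made for this example.

```python
import numpy as np

def conv_matrix(x, n_rows, n_cols):
    """Toeplitz convolution matrix T with T[t, l] = x[t - l] (zero where t - l is out of range)."""
    T = np.zeros((n_rows, n_cols))
    for l in range(n_cols):
        k = min(len(x), n_rows - l)
        if k > 0:
            T[l:l + k, l] = x[:k]
    return T

def bcd_nls(y, M, N, L, n_iter=10, seed=0):
    """Alternate between the s-update (h fixed) and the h-update (s fixed)."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(M * L)            # random initialization of the RIR estimate
    costs = []
    for _ in range(n_iter):
        # s-update: least squares on y = H s + e with H built from the current h
        H = np.vstack([conv_matrix(h[m * L:(m + 1) * L], N, N) for m in range(M)])
        s, *_ = np.linalg.lstsq(H, y, rcond=None)
        # h-update: least squares on y = (I_M kron S) h + e with S built from the current s
        A = np.kron(np.eye(M), conv_matrix(s, N, L))
        h, *_ = np.linalg.lstsq(A, y, rcond=None)
        costs.append(float(np.sum((y - A @ h) ** 2)))   # ||e||_2^2 after this sweep
    return h, s, costs

# Illustrative noiseless data (assumption): small sizes for a quick run.
M, N, L = 3, 64, 8
rng = np.random.default_rng(1)
s0 = rng.standard_normal(N)
h0 = rng.standard_normal(M * L)
y = np.kron(np.eye(M), conv_matrix(s0, N, L)) @ h0

h_hat, s_hat, costs = bcd_nls(y, M, N, L, n_iter=10)
print(costs)
```

Each update minimizes the same cost exactly over one block, so the recorded cost cannot increase from sweep to sweep. The point it converges to still depends on the initialization, consistent with the existence of multiple local solutions.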
3 – ℓ2-regularized NLS problem
Problem (ℓ2-RNLS)
$$\min_{h,s,e} \; \|e\|_2^2 + \lambda_1 \|h\|_W^2 \qquad (7)$$
$$\text{s.t.} \quad y = Hs + e \qquad (8)$$
$$\phantom{\text{s.t.} \quad y} = (I_M \otimes S)\, h + e \qquad (9)$$
Properties
ℓ2-norm regularization results in smoothing effect, facilitating convergence to meaningful local solution
regularization matrix W allows incorporating prior knowledge on h, e.g., using Polack’s model for late reverberation (with fixed decay α)
$$W = I_M \otimes \operatorname{diag}\{\,\dots\,\} \qquad (10)$$
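For a fixed source estimate, the h-update of the ℓ2-regularized problem is a weighted ridge regression with a closed-form solution. The sketch below assumes that the weighted norm means ‖h‖_W² = hᵀWh, uses a random matrix `A` as a stand-in for (I_M ⊗ S), and fills the diagonal of W with a hypothetical exponentially increasing weighting exp(2αl). The diagonal entries actually used on the slide are not reproduced here, so this weighting is an assumption made only for illustration.

```python
import numpy as np

def regularized_h_update(A, y, W, lam1):
    """Minimize ||y - A h||_2^2 + lam1 * h^T W h via the normal equations."""
    return np.linalg.solve(A.T @ A + lam1 * W, A.T @ y)

# Illustrative sizes and data (assumptions).
M, L, alpha, lam1 = 2, 20, 0.025, 0.1
rng = np.random.default_rng(2)
A = rng.standard_normal((60, M * L))      # stand-in for kron(I_M, S)
y = rng.standard_normal(60)

# HYPOTHETICAL weights: penalize late taps more strongly, mirroring a decaying envelope.
w = np.exp(2.0 * alpha * np.arange(L))
W = np.kron(np.eye(M), np.diag(w))

h_hat = regularized_h_update(A, y, W, lam1)

# Gradient of the regularized cost at the solution; it should vanish.
grad = 2.0 * A.T @ (A @ h_hat - y) + 2.0 * lam1 * W @ h_hat
print(np.max(np.abs(grad)))
```

Because W is positive definite, the regularized normal equations have a unique solution even when AᵀA alone is poorly conditioned.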
3 – ℓ1/ℓ2-regularized NLS problem
Problem (ℓ1/ℓ2-RNLS)
$$\min_{h,s,e} \; \|e\|_2^2 + \lambda_1 \|h\|_W^2 + \lambda_2 \|Ds\|_1 \qquad (11)$$
$$\text{s.t.} \quad y = Hs + e \qquad (12)$$
$$\phantom{\text{s.t.} \quad y} = (I_M \otimes S)\, h + e \qquad (13)$$
Properties
ℓ1-norm regularization results in additional smoothing effect
$\|Ds\|_1$ promotes source signal sparsity in basis defined by D
sparsity in spectral basis (DFT, DCT, ...) is meaningful for speech/audio
minimization w.r.t. {s, e} does not yield closed-form solution, but allows efficient numerical solution using convex optimization tools (e.g., interior point method, method of multipliers, ...)
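To illustrate how the s-update can be solved numerically, the sketch below uses a proximal gradient iteration (ISTA) with soft thresholding. This is a swapped-in technique chosen for brevity, not one of the solvers named above. It assumes D is an orthonormal DCT-II matrix, so that the substitution c = Ds turns the subproblem into a standard ℓ1-regularized least-squares problem in c. The random matrix `H` is a stand-in for the stacked convolution matrices, and all function names are illustrative.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix D, so that D @ D.T is the identity."""
    n = np.arange(N)
    D = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    D[0, :] /= np.sqrt(2.0)
    return D

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_s_update(H, y, D, lam2, n_iter=200):
    """Minimize ||y - H s||_2^2 + lam2 * ||D s||_1 over s for orthonormal D."""
    B = H @ D.T                                        # model in the coefficients c = D s
    step = 1.0 / (2.0 * np.linalg.norm(B, 2) ** 2)     # 1 / Lipschitz constant of the gradient
    c = np.zeros(B.shape[1])
    objective = []
    for _ in range(n_iter):
        grad = 2.0 * B.T @ (B @ c - y)
        c = soft_threshold(c - step * grad, step * lam2)
        objective.append(float(np.sum((y - B @ c) ** 2) + lam2 * np.sum(np.abs(c))))
    return D.T @ c, objective

# Illustrative data (assumptions): a source that is sparse in the DCT basis.
N = 32
rng = np.random.default_rng(3)
D = dct_matrix(N)
c_true = np.zeros(N)
c_true[[2, 5, 11]] = [1.0, -0.7, 0.4]
s_true = D.T @ c_true
H = rng.standard_normal((3 * N, N))       # stand-in for the stacked convolution matrices
y = H @ s_true

s_hat, objective = ista_s_update(H, y, D, lam2=0.1)
print(objective[0], objective[-1])
```

With the step size set to the reciprocal of the gradient's Lipschitz constant, the objective is non-increasing across iterations. Interior point or method-of-multipliers solvers typically need fewer iterations at a higher cost per iteration.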
4 – Simulation setup
Acoustic scenario
source signal = quasi-stationary voiced speech segment
source signal length N = 1024 at fs = 8 kHz
M = 5 microphones
RIRs = Gaussian white noise (GWN) shaped by Polack’s model envelope (α = 0.025)
RIR length L = 100
Algorithm parameters
regularization parameters λ1 = λ2 = 0.1
D = DCT matrix
random (GWN) initialization of RIR parameter vector estimate
fixed number of kmax = 10 iterations
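The synthetic RIRs of the acoustic scenario can be generated along the following lines. The sketch assumes that the envelope decays as exp(−αl) per sample index l and that the noise has unit variance. The function name `polack_rirs` and the random seed are illustrative.

```python
import numpy as np

def polack_rirs(M=5, L=100, alpha=0.025, seed=0):
    """Gaussian white noise shaped by an exponentially decaying envelope (assumed exp(-alpha * l))."""
    rng = np.random.default_rng(seed)
    envelope = np.exp(-alpha * np.arange(L))       # decay over the tap index l = 0, ..., L-1
    rirs = rng.standard_normal((M, L)) * envelope  # one row per microphone
    return rirs, envelope

rirs, envelope = polack_rirs()
print(rirs.shape, envelope[-1])
```

Under this assumption the envelope falls to exp(−0.025 · 99) of its initial value at the last tap, so the late part of each RIR is strongly attenuated.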
4 – RIR estimation results
Estimated RIR for microphone m = 2
Observations
NLS estimate corresponds to erroneous local solution
ℓ2-norm regularization improves overall estimation performance
  = improved RIR envelope
ℓ1-norm regularization improves local estimation performance
4 – Source signal estimation results
Estimated source signal magnitude spectrum
Observations
NLS estimate corresponds to inaccurate local solution
ℓ2-norm regularization improves overall estimation performance
  = improved source signal spectrum envelope
ℓ1-norm regularization improves local estimation performance
5 – Conclusion
Multi-microphone dereverberation by embedded optimization
allows more general problem formulations (≠ closed-form estimators)
achieves direct source signal estimation (≠ inverse filter design)
facilitates use of prior knowledge
yields promising simulation results
Future research challenges
batch → online (frame-based) processing
synthetic → realistic RIRs
zero → non-zero measurement noise
general-purpose → fast dedicated numerical optimization tools
include perceptual criteria in problem formulation