Binaural multi-channel Wiener filtering for hearing aids:
Preserving interaural time and level differences
1,2 T.J. Klasen, 1 Simon Doclo, 1,2 Tim Van den Bogaert, 1 Marc Moonen, 2 Jan Wouters
1 KU Leuven ESAT, Kasteelpark Arenberg 10, Leuven 2 KU Leuven Lab. ORL, Kapucijnenvoer 33, Leuven tklasen@esat.kuleuven.be
Introduction
•Hearing-impaired persons localize sounds better without their bilateral hearing aids than with them
•Current hearing aids are not designed to preserve localization cues
•Advantages of preserving localization cues
– Visual cues ⇒ Improvement in intelligibility
– Spatial separation ⇒ Improvement in intelligibility
Interaural localization cues
•Interaural time difference (ITD)
– ITD is the difference in arrival time of a signal between the two ears
– ITD cues reside in the low frequencies (< 1500 Hz)
•Interaural level difference (ILD)
– ILD is the intensity difference between the two ears
– ILD cues reside in the high frequencies (> 3000 Hz)
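As a rough illustration of the two cues, the sketch below estimates a broadband ITD from the peak of the cross-correlation between the ear signals and a broadband ILD from their power ratio. The function name `estimate_itd_ild` and the toy signals are assumptions made for this example and are not part of the poster.

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Broadband ITD (seconds) and ILD (dB) of a two-channel signal."""
    # The lag of the cross-correlation peak is the ITD estimate.
    xcorr = np.correlate(left, right, mode="full")
    lags = np.arange(-(len(right) - 1), len(left))
    itd = lags[np.argmax(xcorr)] / fs  # positive: left ear lags the right ear
    # The ILD is the left/right power ratio in dB.
    ild = 10.0 * np.log10(np.mean(left ** 2) / np.mean(right ** 2))
    return itd, ild

# Toy signals (hypothetical): the left ear receives the right-ear signal
# delayed by 8 samples and scaled by 0.5.
fs = 16000
rng = np.random.default_rng(0)
src = rng.standard_normal(2048)
delay = 8
right = src
left = 0.5 * np.concatenate([np.zeros(delay), src[:-delay]])

itd, ild = estimate_itd_ild(left, right, fs)
print(f"ITD = {itd * 1e3:.3f} ms, ILD = {ild:.2f} dB")
```

In practice the ITD would be evaluated on the band below 1500 Hz and the ILD on the band above 3000 Hz, in line with the frequency ranges listed above.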
State of the art
•Binaural Wiener filter ⇒ Preserves speech ITD cues
•Controlled binaural Wiener filter ⇒ Preserves noise ITD cues at cost of noise reduction
•Extended cost function includes ITD and ILD terms ⇒ Iterative optimization techniques
System model
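[Figure: system model. A speaker and a noise source, at angles θ and φ, surround the hearing aid user. The left microphone signals $Y_{L0}(\omega), \dots, Y_{LM-1}(\omega)$ and the right microphone signals $Y_{R0}(\omega), \dots, Y_{RM-1}(\omega)$ feed the filters $W_L(\omega)$ and $W_R(\omega)$, which produce the left and right outputs (labeled $Z_{L0}(\omega)$ and $Z_{R1}(\omega)$).]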
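•Signals received at the mth microphone pair
$$Y_m^L(\omega) = \underbrace{X_m^L(\omega)}_{\text{Speech}} + \underbrace{V_m^L(\omega)}_{\text{Noise}}, \qquad Y_m^R(\omega) = \underbrace{X_m^R(\omega)}_{\text{Speech}} + \underbrace{V_m^R(\omega)}_{\text{Noise}}$$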
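•2M-dimensional signal vector
$$\mathbf{Y}(\omega) = \begin{bmatrix} Y_0^L(\omega) & \dots & Y_{M-1}^L(\omega) & Y_0^R(\omega) & \dots & Y_{M-1}^R(\omega) \end{bmatrix}^T, \qquad \mathbf{Y}(\omega) = \mathbf{X}(\omega) + \mathbf{V}(\omega)$$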
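•Left and right 2M-dimensional filters
$$\mathbf{W}(\omega) = \begin{bmatrix} \mathbf{W}_L(\omega) \\ \mathbf{W}_R(\omega) \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} W_0^L(\omega) & \dots & W_{2M-1}^L(\omega) \end{bmatrix}^T \\ \begin{bmatrix} W_0^R(\omega) & \dots & W_{2M-1}^R(\omega) \end{bmatrix}^T \end{bmatrix}$$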
Interaural transfer function (ITF)
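•Input and output ITFs (speech and noise)
$$\mathrm{ITF}^X_{\mathrm{des}} = \frac{X_0^L}{X_0^R}, \qquad \mathrm{ITF}^V_{\mathrm{out}}(\mathbf{W}) = \frac{\mathbf{W}_L^H \mathbf{V}}{\mathbf{W}_R^H \mathbf{V}}$$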
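•Desired ITFs of the speech and noise components
– As a function of the desired angles $\theta_X$ and $\theta_V$ and of the frequency $\omega$
$$\mathrm{ITF}^X_{\mathrm{des}} = \frac{\mathrm{HRTF}^X_L(\omega, \theta_X)}{\mathrm{HRTF}^X_R(\omega, \theta_X)}$$
– As the original ITFs
$$\mathrm{ITF}^X_{\mathrm{des}} = \frac{E\{X_0^L X_0^{R*}\}}{E\{X_0^R X_0^{R*}\}}, \qquad \mathrm{ITF}^V_{\mathrm{des}} = \frac{E\{V_0^L V_0^{R*}\}}{E\{V_0^R V_0^{R*}\}}$$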
•Preserve binaural cues ⇒ original ITFs as desired ITFs
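A minimal sketch of the "original ITF" estimate for a single frequency bin, with the expectations replaced by averages over STFT frames. The function name `estimate_itf` and the synthetic data are assumptions made for illustration. The magnitude of the ITF carries the level difference and its phase carries the interaural phase (and hence time) difference.

```python
import numpy as np

def estimate_itf(ref_left, ref_right):
    """ITF = E{L R*} / E{R R*}, with E{.} replaced by a frame average."""
    cross = np.mean(ref_left * np.conj(ref_right))
    power = np.mean(np.abs(ref_right) ** 2)
    return cross / power

# Synthetic single-bin data (hypothetical): the left reference microphone
# sees the right reference signal scaled by a complex gain h.
rng = np.random.default_rng(1)
K = 500  # number of frames
x_right = rng.standard_normal(K) + 1j * rng.standard_normal(K)
h = 0.5 * np.exp(1j * 0.3)
x_left = h * x_right

itf = estimate_itf(x_left, x_right)
ild_db = 20.0 * np.log10(np.abs(itf))  # level difference in dB
ipd = np.angle(itf)                    # interaural phase difference in rad
print(itf, ild_db, ipd)
```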
Binaural Wiener filtering
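•Original cost function
$$J(\mathbf{W}) = E\left\{ \underbrace{\left\| \begin{bmatrix} X_0^L - \mathbf{W}_L^H \mathbf{X} \\ X_0^R - \mathbf{W}_R^H \mathbf{X} \end{bmatrix} \right\|^2}_{\text{Speech distortion}} + \mu \underbrace{\left\| \begin{bmatrix} \mathbf{W}_L^H \mathbf{V} \\ \mathbf{W}_R^H \mathbf{V} \end{bmatrix} \right\|^2}_{\text{Residual noise}} \right\}$$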
•Goal: Output speech and noise parallel to desired ITFs
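[Figure: in the plane with axes R and I, the output noise vector $\begin{bmatrix} \mathbf{W}_L^H \mathbf{V} & \mathbf{W}_R^H \mathbf{V} \end{bmatrix}^T$ is split into a component parallel (∥) to $\begin{bmatrix} \mathrm{ITF}^V_{\mathrm{des}} & 1 \end{bmatrix}^T$ and a component perpendicular (⊥) to it.]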
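•Add ITF terms to the cost function to minimize the perpendicular part
$$J(\mathbf{W}) = E\left\{ \underbrace{\left\| \begin{bmatrix} X_0^L - \mathbf{W}_L^H \mathbf{X} \\ X_0^R - \mathbf{W}_R^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_L^H \mathbf{V} \\ \mathbf{W}_R^H \mathbf{V} \end{bmatrix} \right\|^2}_{\text{Original SDW cost function}} + \underbrace{\alpha \left\| \begin{bmatrix} \mathbf{W}_L^H \mathbf{X} \\ \mathbf{W}_R^H \mathbf{X} \end{bmatrix}_{\perp} \right\|^2 + \beta \left\| \begin{bmatrix} \mathbf{W}_L^H \mathbf{V} \\ \mathbf{W}_R^H \mathbf{V} \end{bmatrix}_{\perp} \right\|^2}_{\text{Additional ITF terms}} \right\}$$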
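•Rewrite using the definition of the cross product
$$J(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_0^L - \mathbf{W}_L^H \mathbf{X} \\ X_0^R - \mathbf{W}_R^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_L^H \mathbf{V} \\ \mathbf{W}_R^H \mathbf{V} \end{bmatrix} \right\|^2 + \alpha \frac{\left| \mathbf{W}_L^H \mathbf{X} - \mathrm{ITF}^X_{\mathrm{des}} \mathbf{W}_R^H \mathbf{X} \right|^2}{\left\| \begin{bmatrix} \mathrm{ITF}^X_{\mathrm{des}} \\ 1 \end{bmatrix} \right\|^2} + \beta \frac{\left| \mathbf{W}_L^H \mathbf{V} - \mathrm{ITF}^V_{\mathrm{des}} \mathbf{W}_R^H \mathbf{V} \right|^2}{\left\| \begin{bmatrix} \mathrm{ITF}^V_{\mathrm{des}} \\ 1 \end{bmatrix} \right\|^2} \right\}$$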
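The rewritten ITF terms rest on an identity that can be checked numerically: for a two-element output vector $[a \;\; b]^T$ and the direction $[\mathrm{ITF} \;\; 1]^T$, the energy of the perpendicular component equals $|a - \mathrm{ITF}\, b|^2 / (|\mathrm{ITF}|^2 + 1)$. The sketch below compares an explicit orthogonal projection with this closed form. The function names and the random test values are assumptions made for this example.

```python
import numpy as np

def perp_energy_projection(u, d):
    """Squared norm of the part of u that is orthogonal to d."""
    u_parallel = d * (np.vdot(d, u) / np.vdot(d, d))  # vdot conjugates d
    return np.linalg.norm(u - u_parallel) ** 2

def perp_energy_closed_form(u, itf):
    """|a - ITF b|^2 / (|ITF|^2 + 1) for u = [a, b]."""
    return np.abs(u[0] - itf * u[1]) ** 2 / (np.abs(itf) ** 2 + 1.0)

# Hypothetical filter outputs u = [W_L^H V, W_R^H V] and a desired ITF.
rng = np.random.default_rng(3)
u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
itf = 0.7 * np.exp(1j * 0.4)
d = np.array([itf, 1.0])

e_proj = perp_energy_projection(u, d)
e_closed = perp_energy_closed_form(u, itf)
print(e_proj, e_closed)
```

The closed form is quadratic in the filter coefficients, which is what allows the extended cost function to be minimized without iterative optimization.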
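•Take the derivative of $J(\mathbf{W})$, set it to zero, and solve for $\mathbf{W}$
$$\mathbf{W} = \left( E\left\{ \mathbf{R}_{RX} + \mu \mathbf{R}_{RV} + \alpha \mathbf{R}_{RXC} + \beta \mathbf{R}_{RVC} \right\} \right)^{-1} E\{\mathbf{r}_X\}$$
where
$$\mathbf{r}_X = \begin{bmatrix} X_0^{L*} \mathbf{X} \\ X_0^{R*} \mathbf{X} \end{bmatrix}, \qquad \mathbf{R}_X = \mathbf{X}\mathbf{X}^H, \qquad \mathbf{R}_V = \mathbf{V}\mathbf{V}^H$$
$$\mathbf{R}_{RX} = \begin{bmatrix} \mathbf{R}_X & \mathbf{0}_{2M} \\ \mathbf{0}_{2M} & \mathbf{R}_X \end{bmatrix}, \qquad \mathbf{R}_{RV} = \begin{bmatrix} \mathbf{R}_V & \mathbf{0}_{2M} \\ \mathbf{0}_{2M} & \mathbf{R}_V \end{bmatrix}$$
$$\mathbf{R}_{RXC} = \begin{bmatrix} \mathbf{R}_X & -\mathrm{ITF}^{X*}_{\mathrm{des}} \mathbf{R}_X \\ -\mathrm{ITF}^X_{\mathrm{des}} \mathbf{R}_X & |\mathrm{ITF}^X_{\mathrm{des}}|^2 \mathbf{R}_X \end{bmatrix}, \qquad \mathbf{R}_{RVC} = \begin{bmatrix} \mathbf{R}_V & -\mathrm{ITF}^{V*}_{\mathrm{des}} \mathbf{R}_V \\ -\mathrm{ITF}^V_{\mathrm{des}} \mathbf{R}_V & |\mathrm{ITF}^V_{\mathrm{des}}|^2 \mathbf{R}_V \end{bmatrix}$$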
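The closed-form solution can be sketched for one frequency bin as follows. This is an illustration under stated assumptions, not the authors' implementation: the correlation matrices are taken as already averaged, the reference microphones are assumed to be elements 0 (left) and M (right) of the stacked vector, the normalization by $|\mathrm{ITF}|^2 + 1$ is treated as absorbed into α and β (as in the block matrices above), and the names `binaural_mwf` and `itf_penalty` as well as the synthetic covariances are hypothetical.

```python
import numpy as np

def binaural_mwf(R_x, R_v, itf_x, itf_v, mu=1.0, alpha=0.0, beta=0.0):
    """Solve (R_RX + mu R_RV + alpha R_RXC + beta R_RVC) W = r_X for one bin."""
    n = R_x.shape[0]  # n = 2M microphones in total
    M = n // 2
    zero = np.zeros((n, n), dtype=complex)

    def block_diag2(R):
        return np.block([[R, zero], [zero, R]])

    def itf_block(R, itf):
        return np.block([[R, -np.conj(itf) * R],
                         [-itf * R, np.abs(itf) ** 2 * R]])

    A = (block_diag2(R_x) + mu * block_diag2(R_v)
         + alpha * itf_block(R_x, itf_x) + beta * itf_block(R_v, itf_v))
    # E{X X_0^{L*}} and E{X X_0^{R*}} are columns 0 and M of R_x.
    r_x = np.concatenate([R_x[:, 0], R_x[:, M]])
    W = np.linalg.solve(A, r_x)
    return W[:n], W[n:]  # W_L, W_R

def itf_penalty(W_L, W_R, R, itf):
    """E{|W_L^H S - ITF W_R^H S|^2} for a signal S with correlation matrix R."""
    d = W_L - np.conj(itf) * W_R
    return np.vdot(d, R @ d).real

# Synthetic covariances (hypothetical): rank-one speech, directional noise
# plus a small uncorrelated part so that R_v is invertible.
rng = np.random.default_rng(2)
M = 2
a = rng.standard_normal(2 * M) + 1j * rng.standard_normal(2 * M)
b = rng.standard_normal(2 * M) + 1j * rng.standard_normal(2 * M)
R_x = np.outer(a, np.conj(a))
R_v = np.outer(b, np.conj(b)) + 0.1 * np.eye(2 * M)

# Original ITFs used as desired ITFs.
itf_x = R_x[0, M] / R_x[M, M]
itf_v = R_v[0, M] / R_v[M, M]

W_L0, W_R0 = binaural_mwf(R_x, R_v, itf_x, itf_v)              # alpha = beta = 0
W_L1, W_R1 = binaural_mwf(R_x, R_v, itf_x, itf_v, beta=100.0)  # noise ITF term on
pen0 = itf_penalty(W_L0, W_R0, R_v, itf_v)
pen1 = itf_penalty(W_L1, W_R1, R_v, itf_v)
print(pen0, pen1)
```

With α = β = 0 the block system decouples into two independent Wiener filters, one per ear. Raising β can only lower (or keep) the noise ITF penalty, at the price of a higher value of the original cost.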
Simulations
Setup
•T60 = 0.76 s, fs = 16 kHz, and FFT size = 256
•HINT speech at 345 degrees and HINT noise at 60 degrees
•Input SNR: left 2.8 dB, right -6.8 dB
•GN ReSound Canta behind the ear hearing aids on CORTEX MK2 artificial head
•Varied α and β from 0 to 100 with µ = 1
Performance measures
•ITD error (N bins < 1500 Hz)
$$\frac{1}{N} \sum_{i=1}^{N} \left( 1 - \cos\left( \angle E\{X_0^L X_0^{R*}\} - \angle E\{\mathbf{W}_L^H \mathbf{X} (\mathbf{W}_R^H \mathbf{X})^*\} \right) \right)$$
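•ILD error (all N bins)
$$\frac{1}{N} \sum_{i=1}^{N} \left| 10 \log_{10} \frac{P^L_{\mathrm{in}}(\omega_i)}{P^R_{\mathrm{in}}(\omega_i)} - 10 \log_{10} \frac{P^L_{\mathrm{out}}(\omega_i)}{P^R_{\mathrm{out}}(\omega_i)} \right|$$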
•Improvement in speech intelligibility weighted signal-to-noise ratio (SNR_INT)
$$\mathrm{SNR}_{\mathrm{INT}} = \sum_{j=1}^{J} w_j \, \mathrm{SNR}_j$$
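The three performance measures can be sketched as below. The function names, the argument layout (one value per frequency bin or band), and the toy numbers are assumptions made for this example. The absolute value in the ILD error is likewise an assumption, and the band weights $w_j$ are not specified on the poster, so the ones used here are placeholders.

```python
import numpy as np

def itd_error(cross_in, cross_out):
    """Mean of 1 - cos(phase difference) over bins below 1500 Hz.

    cross_in, cross_out: per-bin complex cross-spectra E{L R*} of the
    input and output component.
    """
    return np.mean(1.0 - np.cos(np.angle(cross_in) - np.angle(cross_out)))

def ild_error(p_in_l, p_in_r, p_out_l, p_out_r):
    """Mean absolute difference (dB) between input and output ILD."""
    ild_in = 10.0 * np.log10(p_in_l / p_in_r)
    ild_out = 10.0 * np.log10(p_out_l / p_out_r)
    return np.mean(np.abs(ild_in - ild_out))  # absolute value assumed

def snr_int(weights, snr_db):
    """Intelligibility weighted SNR: sum over bands of w_j * SNR_j."""
    return np.sum(np.asarray(weights) * np.asarray(snr_db))

# Toy values (hypothetical).
cross_in = np.exp(1j * np.array([0.1, 0.2, 0.3]))
err_same = itd_error(cross_in, cross_in)   # identical phases
err_flip = itd_error(cross_in, -cross_in)  # phases off by pi
err_ild = ild_error(np.array([4.0, 4.0]), np.array([1.0, 1.0]),
                    np.array([2.0, 2.0]), np.array([1.0, 1.0]))
snr = snr_int([0.5, 0.5], [10.0, 14.0])    # placeholder band weights
print(err_same, err_flip, err_ild, snr)
```

The ITD error is bounded between 0 (phases preserved) and 2 (phases inverted), which makes it insensitive to phase wrapping.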
Results
[Figure: six surface plots over alpha (0 to 100) and beta (0 to 100).
– ITD Error Speech Component (vertical axis: ITD Error, 0 to 0.5)
– ITD Error Noise Component (vertical axis: ITD Error, 0 to 0.5)
– ILD Error Speech Component (vertical axis: ILD Error (dB), 0 to 12)
– ILD Error Noise Component (vertical axis: ILD Error (dB), 0 to 12)
– Output Intelligibility Weighted SNR Left Microphone (vertical axis: Intelligibility Weighted SNR (dB), 9 to 14)
– Output Intelligibility Weighted SNR Right Microphone (vertical axis: Intelligibility Weighted SNR (dB), 9 to 14)]