Noise reduction and binaural cue preservation of multi-microphone algorithms
Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters
Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium
Dept. of Neurosciences (ExpORL), KU Leuven, Belgium
Oldenburg, June 28 2007
Overview
• Problem statement
o Improve speech intelligibility + preserve spatial awareness
o Bilateral vs. binaural processing
• Binaural signal processing using multi-channel Wiener filter
o MWF: noise reduction and preservation of speech cues, noise cues are distorted
o Extension of MWF to preserve binaural cues of all components:
– MWFv: partial estimation of noise component
– MWF-ITF: extension with Interaural Transfer Function
o Physical and perceptual evaluation
• Reduce bandwidth requirements of wireless link
o Distributed binaural MWF
Problem statement
• Many hearing impaired are fitted with a hearing aid at both ears
o Signal processing to selectively enhance useful speech signal and improve speech intelligibility
o Signal processing to preserve directional hearing and spatial awareness
o Multiple microphones available: spectral + spatial processing
• Binaural auditory cues:
o Interaural Time Difference (ITD) and Interaural Level Difference (ILD)
o Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localisation
o ITD: f < 1500 Hz, ILD: f > 2000 Hz
[Figure: illustration of IPD/ITD and ILD]
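As a rough illustration of the two cues, the sketch below computes a broadband ILD (left/right power ratio in dB) and an ITD (lag of the cross-correlation peak) for a synthetic signal pair. The function names, the test signal, the 8-sample delay and the factor 0.5 are made-up values for illustration only; they are not taken from the slides.

```python
import numpy as np

def ild_db(left, right):
    """Interaural level difference: left/right power ratio in dB."""
    return 10.0 * np.log10(np.mean(left ** 2) / np.mean(right ** 2))

def itd_seconds(left, right, fs):
    """Interaural time difference from the cross-correlation peak.
    A positive value means the right signal lags the left one."""
    xcorr = np.correlate(right, left, mode="full")
    lag = np.argmax(xcorr) - (len(left) - 1)
    return lag / fs

fs = 20480                       # sampling rate in Hz
rng = np.random.default_rng(0)
src = rng.standard_normal(2048)  # hypothetical source signal
delay = 8                        # hypothetical interaural delay in samples

left = src
right = 0.5 * np.concatenate([np.zeros(delay), src[:-delay]])

ild = ild_db(left, right)
itd = itd_seconds(left, right, fs)
print(f"ILD = {ild:.2f} dB, ITD = {itd * 1e6:.0f} us")
```

A per-frequency version would apply the same two measures to each band, using the phase (IPD) at low frequencies and the level difference at high frequencies.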
Problem statement -bilateral/binaural
Binaural processing
Bandwidth reduction
Conclusions
Bilateral vs. Binaural
• Bilateral system
o Independent left/right processing
o Preservation of binaural cues for localisation?
• Binaural system
o More microphones: better performance?
o Preservation of binaural cues?
o Requires a binaural link
• Bilateral system:
o Independent processing of left and right hearing aid
o Localisation cues are distorted
[Figure: RMS error per loudspeaker when accumulating all responses of the different test conditions (NH = normal hearing, NO = hearing impaired without hearing aids, O = omnidirectional configuration, A = adaptive directional configuration) [Van den Bogaert et al., 2006]]
o Distorted localisation cues also affect intelligibility through the binaural hearing advantage
• Binaural system:
o Cooperation between left and right hearing aid (e.g. wireless link)
o Assumption: all microphone signals are available at the same time
Objectives/requirements for binaural algorithm:
1. SNR improvement: noise reduction, limit speech distortion
2. Preservation of binaural cues (speech/noise) to exploit binaural hearing advantage
3. No assumption about position of speech source and microphones
[Van den Bogaert et al., 2006]
Configuration and signals
• Configuration: microphone array with M microphones at left and right hearing aid, communication between hearing aids

$$Y_{0,m}(\omega) = X_{0,m}(\omega) + V_{0,m}(\omega), \quad m = 0 \ldots M_0 - 1$$

with $X_{0,m}(\omega)$ the speech component and $V_{0,m}(\omega)$ the noise component (likewise $Y_{1,m}(\omega)$ for the other hearing aid).

$$Z_0(\omega) = \mathbf{W}_0^H(\omega)\,\mathbf{Y}(\omega), \qquad Z_1(\omega) = \mathbf{W}_1^H(\omega)\,\mathbf{Y}(\omega)$$
• Use all microphone signals to compute output signal at both ears
Problem statement
Binaural processing -MWF
-Cue preservation -Physical evaluation -Perceptual eval
Bandwidth reduction
Conclusions
Overview of cost functions
• Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears
• Trade-off between noise reduction and speech distortion: speech-distortion weighted multi-channel Wiener filter (SDW-MWF) [Doclo 2002, Spriet 2004]
• Binaural cue preservation of speech + noise:
o Partial estimation of noise component (MWFv) [Klasen 2005]
o Extension with ITD-ILD or Interaural Transfer Function (ITF) [Doclo 2005, Klasen 2006, Van den Bogaert 2007]
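Binaural multi-channel Wiener filter
• Binaural SDW-MWF: estimate of speech component in microphone signal at both ears (usually front microphone) + trade-off between noise reduction and speech distortion

$$J_{SDW}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\}$$

Here $X_{0,r_0}$ and $X_{1,r_1}$ are the speech components in the front microphones; the first term measures speech distortion, the second noise reduction, and $\mu$ sets the trade-off.

$$\mathbf{W}_{SDW} = \mathbf{R}^{-1}\mathbf{r}, \quad \mathbf{R} = \begin{bmatrix} \mathbf{R}_x + \mu\mathbf{R}_v & \mathbf{0}_M \\ \mathbf{0}_M & \mathbf{R}_x + \mu\mathbf{R}_v \end{bmatrix}, \quad \mathbf{r} = \begin{bmatrix} \mathbf{r}_{x,0} \\ \mathbf{r}_{x,1} \end{bmatrix}, \quad \mathbf{R}_x = \mathbf{R}_y - \mathbf{R}_v$$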
o Depends on second-order statistics of speech and noise
o Estimate Ry during speech-dominated time-frequency segments and Rv during noise-dominated segments; this requires a robust voice activity detection (VAD) mechanism
o No assumptions about positions of microphones and sources
o Adaptive (LMS-based) algorithm available [Spriet 2004, Doclo 2007]
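The batch estimation procedure described above can be sketched as follows for a single frequency bin: Ry is averaged over frames flagged as speech, Rv over noise-only frames, and the filter for one reference microphone is (Rx + μ Rv)^{-1} Rx e_ref with Rx = Ry - Rv. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sdw_mwf`, the random transfer functions `A` and `B`, the sensor-noise level and the perfect VAD mask are all hypothetical.

```python
import numpy as np

def sdw_mwf(Y, speech_active, ref, mu=1.0):
    """Y: (M, T) complex STFT frames of one frequency bin.
    speech_active: (T,) boolean VAD decisions.
    Returns the (M,) filter estimating the speech component in mic `ref`."""
    Ys, Yn = Y[:, speech_active], Y[:, ~speech_active]
    Ry = Ys @ Ys.conj().T / Ys.shape[1]   # speech + noise segments
    Rv = Yn @ Yn.conj().T / Yn.shape[1]   # noise-only segments
    Rx = Ry - Rv
    return np.linalg.solve(Rx + mu * Rv, Rx[:, ref])

rng = np.random.default_rng(1)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

M, T = 4, 4000
A = cn(M)                                   # hypothetical speech transfer functions
B = cn(M)                                   # hypothetical noise transfer functions
active = np.arange(T) < T // 2              # perfect VAD: speech in first half
X = np.outer(A, cn(T)) * active             # speech component
V = np.outer(B, cn(T)) + 0.1 * cn(M, T)     # point noise source + sensor noise

W0 = sdw_mwf(X + V, active, ref=0, mu=5.0)

snr = lambda x, v: 10 * np.log10(np.mean(abs(x) ** 2) / np.mean(abs(v) ** 2))
Xa, Va = X[:, active], V[:, active]
snr_in = snr(Xa[0], Va[0])
snr_out = snr(W0.conj() @ Xa, W0.conj() @ Va)   # output is Z = W^H Y
print(f"input SNR {snr_in:.1f} dB, output SNR {snr_out:.1f} dB")
```

The binaural filter is obtained by calling the same function twice, once with the reference microphone of each ear.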
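Binaural multi-channel Wiener filter
• Interpretation for single speech source:
o Spectral and spatial filtering operation

$$\mathbf{W}_{SDW,0} = \frac{\boldsymbol{\Gamma}_v^{-1}\mathbf{A}}{\mathbf{A}^H\boldsymbol{\Gamma}_v^{-1}\mathbf{A}} \cdot \frac{\mathbf{A}^H\boldsymbol{\Gamma}_v^{-1}\mathbf{A}}{\mathbf{A}^H\boldsymbol{\Gamma}_v^{-1}\mathbf{A} + \mu P_v / P_s} \cdot A_{0,r_0}^{*}$$

with $\boldsymbol{\Gamma}_v$ the (spatial) coherence matrix and $P$ the (spectral) power, i.e. $\mathbf{R}_x = P_s \mathbf{A}\mathbf{A}^H$ and $\mathbf{R}_v = P_v \boldsymbol{\Gamma}_v$
o Equivalent to superdirective beamformer (diffuse noise field) or delay-and-sum beamformer (spatially white noise field) + single-channel WF-based postfilter (spectral subtraction)
o The beamformer exploits the spatial separation between speech and noise sources; the postfilter depends on the SNR
• Binaural cues (ITD-ILD):
o Perfectly preserves binaural cues of speech component
o Binaural cues of noise component become those of the speech component!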
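The cue collapse noted above can be checked numerically. For a single speech source the left and right filters are both proportional to the same vector, scaled by the conjugate of the speech transfer function at the respective reference microphone, so any signal passed through them leaves with the interaural transfer function of the speech. The sketch below assumes random transfer functions `A` (speech) and `B` (noise) and a hypothetical microphone indexing with references `r0` and `r1`; none of these values come from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

M, mu, Ps = 4, 5.0, 1.0
A, B = cn(M), cn(M)        # hypothetical speech / noise transfer functions
r0, r1 = 0, 2              # reference microphones, one per ear (assumed indexing)

Rx = Ps * np.outer(A, A.conj())                   # rank-1 speech correlation
Rv = np.outer(B, B.conj()) + 0.01 * np.eye(M)     # point source + sensor noise

W0 = np.linalg.solve(Rx + mu * Rv, Rx[:, r0])
W1 = np.linalg.solve(Rx + mu * Rv, Rx[:, r1])

itf_speech_in = A[r0] / A[r1]
itf_noise_in = B[r0] / B[r1]
itf_speech_out = (W0.conj() @ A) / (W1.conj() @ A)
itf_noise_out = (W0.conj() @ B) / (W1.conj() @ B)

print("speech ITF in/out:", itf_speech_in, itf_speech_out)
print("noise  ITF in/out:", itf_noise_in, itf_noise_out)
```

The output ITF of the noise equals the input ITF of the speech, which is why the noise is perceived as coming from the speech direction.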
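Partial noise estimation (MWFv)
• Partial estimation of noise component
o Estimate of sum of speech component and scaled noise component, with mixing parameter $\eta$:

$$J(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} + \eta V_{0,r_0} - \mathbf{W}_0^H \mathbf{Y} \\ X_{1,r_1} + \eta V_{1,r_1} - \mathbf{W}_1^H \mathbf{Y} \end{bmatrix} \right\|^2 \right\}$$

With the speech-distortion weighting $\mu$:

$$J(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \eta V_{0,r_0} - \mathbf{W}_0^H \mathbf{V} \\ \eta V_{1,r_1} - \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\}$$

o Relationship with SDW-MWF: mix with reference microphone signals

$$Z_0 = \eta\, Y_{0,r_0} + (1-\eta)\, Z_{SDW,0}, \qquad Z_1 = \eta\, Y_{1,r_1} + (1-\eta)\, Z_{SDW,1}$$

o Reduces noise reduction performance
o Works for multiple noise sources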
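The relationship with the SDW-MWF can be verified numerically: setting the gradient of the weighted partial-noise cost to zero gives W = (Rx + μ Rv)^{-1} (Rx e + μ η Rv e), which should coincide with the mixture η e + (1 - η) W_SDW. The sketch below checks this for one ear; the correlation matrices are built from random, purely hypothetical transfer functions, and η is the mixing parameter.

```python
import numpy as np

rng = np.random.default_rng(3)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

M, mu, eta, ref = 4, 5.0, 0.2, 0
A, B = cn(M), cn(M)                              # hypothetical transfer functions
Rx = np.outer(A, A.conj())
Rv = np.outer(B, B.conj()) + 0.01 * np.eye(M)
e = np.zeros(M)
e[ref] = 1.0                                     # selects the reference microphone

# SDW-MWF for the reference microphone
W_sdw = np.linalg.solve(Rx + mu * Rv, Rx @ e)

# Direct minimiser of E|X_r - W^H X|^2 + mu * E|eta V_r - W^H V|^2
W_eta = np.linalg.solve(Rx + mu * Rv, Rx @ e + mu * eta * (Rv @ e))

# Mixing the reference microphone signal with the SDW-MWF output
W_mix = eta * e + (1.0 - eta) * W_sdw

print("max difference:", np.max(np.abs(W_eta - W_mix)))
```

In practice this means the partial noise estimate needs no extra filter design: the SDW-MWF output is simply mixed with the unprocessed reference signal.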
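Interaural Wiener filter (MWF-ITF)
• Extension of SDW-MWF with binaural cues
o Add term related to binaural cues of noise (and speech) component, with weights $\alpha$ (speech) and $\beta$ (noise):

$$J_{tot}(\mathbf{W}) = J_{SDW}(\mathbf{W}) + \alpha\, J_{cue}^{x}(\mathbf{W}) + \beta\, J_{cue}^{v}(\mathbf{W})$$

o Possible cues: ITD, ILD, Interaural Transfer Function (ITF)

$$ITF_{out}^{v} = \frac{Z_{v0}}{Z_{v1}} = \frac{\mathbf{W}_0^H \mathbf{V}}{\mathbf{W}_1^H \mathbf{V}}, \qquad ITF_{in}^{v} = \frac{V_{0,r_0}}{V_{1,r_1}}, \quad \text{e.g.} \quad ITF_{in}^{v} = \frac{E\{V_{0,r_0} V_{1,r_1}^{*}\}}{E\{V_{1,r_1} V_{1,r_1}^{*}\}} = \frac{R_v(r_0, r_1)}{R_v(r_1, r_1)}$$

$$J_{tot}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\} + \alpha\, E\left\{ \left| \mathbf{W}_0^H \mathbf{X} - ITF_{in}^{x}\, \mathbf{W}_1^H \mathbf{X} \right|^2 \right\} + \beta\, E\left\{ \left| \mathbf{W}_0^H \mathbf{V} - ITF_{in}^{v}\, \mathbf{W}_1^H \mathbf{V} \right|^2 \right\}$$

The $\alpha$ term preserves the ITF of the speech component, the $\beta$ term the ITF of the noise component.
o Closed form expression!
o Large $\beta$ changes direction of speech: increase weight $\alpha$
o Implicit assumption of single noise source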
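Because every term of the total cost is quadratic in the stacked filter W = [W0; W1], the minimiser follows from one linear system. The sketch below is one way to write that closed form, derived here from the cost function rather than copied from the slides: each ITF term E|W0^H C - ITF_in W1^H C|^2 equals (W0 - ITF_in^* W1)^H R_c (W0 - ITF_in^* W1), which adds a block matrix to R. The function names and the test scenario (random transfer functions, single noise source plus sensor noise) are assumptions.

```python
import numpy as np

def itf_in(C, r0, r1):
    """Input ITF from a correlation matrix: E{C_r0 C_r1*} / E{C_r1 C_r1*}."""
    return C[r0, r1] / C[r1, r1]

def itf_cost(W0, W1, C, r0, r1):
    """E{|W0^H c - ITF_in W1^H c|^2} for a component with correlation C."""
    d = W0 - np.conj(itf_in(C, r0, r1)) * W1
    return float(np.real(d.conj() @ C @ d))

def mwf_itf(Rx, Rv, r0, r1, mu=1.0, alpha=0.0, beta=0.0):
    """Minimise J_SDW + alpha * J_ITF^x + beta * J_ITF^v over W = [W0; W1]."""
    M = Rx.shape[0]
    Z = np.zeros((M, M))
    R = np.block([[Rx + mu * Rv, Z], [Z, Rx + mu * Rv]])
    r = np.concatenate([Rx[:, r0], Rx[:, r1]])
    for weight, C in ((alpha, Rx), (beta, Rv)):
        h = itf_in(C, r0, r1)
        R = R + weight * np.block([[C, -np.conj(h) * C],
                                   [-h * C, abs(h) ** 2 * C]])
    W = np.linalg.solve(R, r)
    return W[:M], W[M:]

rng = np.random.default_rng(5)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
M, r0, r1 = 4, 0, 2                     # assumed microphone indexing
A, B = cn(M), cn(M)                     # hypothetical transfer functions
Rx = np.outer(A, A.conj())
Rv = np.outer(B, B.conj()) + 0.01 * np.eye(M)

cost_noise = {}
for beta in (0.0, 10.0):
    W0, W1 = mwf_itf(Rx, Rv, r0, r1, mu=5.0, alpha=0.0, beta=beta)
    cost_noise[beta] = itf_cost(W0, W1, Rv, r0, r1)
    print(f"beta = {beta:4.1f}: noise ITF cost = {cost_noise[beta]:.3e}")
```

Raising β can only lower the noise ITF cost at the minimiser, at the price of a larger SDW cost, which is the trade-off the weights control.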
Simulation setup
• Identification of HRTFs:
o Binaural recordings on CORTEX MK2 artificial head
o 2 omni-directional microphones on each hearing aid (d = 1 cm)
o LS = -90:15:90, 90:30:270, 1 m from head
o Room reverberation: T60 = 140 ms (and T60 = 510 ms)
Experimental results
• Simulations:
o SxNy (speech at angle x, noise at angle y), SNR = 0 dB on left front microphone (broadband)
o fs = 20.48 kHz
• MWF algorithmic parameters:
o Batch procedure, perfect VAD
o L = 96, trade-off parameter μ = 5
o MWFv for different mixing parameters η, MWF-ITF for different ITF weights α, β
• Physical evaluation:
o Speech = HINT, noise = babble noise
o Speech intelligibility: SNR
o Localisation: ITD / ILD
• Perceptual evaluation:
o Preliminary study with NH subjects
o Speech intelligibility: SRT
o Localisation: localise S and N
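Physical evaluation
• Performance measures:
o Intelligibility weighted SNR improvement (left/right)

$$\Delta SNR_L = \sum_i I_i\, \Delta SNR_{L,i}$$

with $I_i$ the importance of the i-th frequency for speech intelligibility
o ILD error (speech/noise component), ILD = power ratio

$$\Delta ILD_x = \sum_i \left| ILD_{out}^{x}(\omega_i) - ILD_{in}^{x}(\omega_i) \right|$$

o ITD error (speech/noise component), ITD = phase of cross-correlation

$$\Delta ITD_x = \sum_i I_i\, \Delta ITD_x(\omega_i), \qquad \Delta ITD_x(\omega_i) = \left| \angle E\{X_{0,r_0} X_{1,r_1}^{*}\} - \angle E\{Z_{x0} Z_{x1}^{*}\} \right|$$

where the weights $I_i$ here implement a low-pass filter at 1500 Hz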
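The three measures above can be sketched per frequency bin as below. Several details are assumptions made for the sake of a runnable example: the band importances are placeholder numbers (not a standardised speech-intelligibility table), the errors are averaged rather than summed, and the 1500 Hz low-pass is approximated by a hard cut-off on the bin frequencies.

```python
import numpy as np

def weighted_snr_improvement(snr_in_db, snr_out_db, importance):
    """sum_i I_i * (SNR_out,i - SNR_in,i), with normalised importances I_i."""
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()
    return float(np.sum(w * (np.asarray(snr_out_db) - np.asarray(snr_in_db))))

def ild_db(c0, c1):
    """ILD per bin: power ratio between the two ears in dB. c0, c1: (bins, frames)."""
    return 10 * np.log10(np.mean(abs(c0) ** 2, axis=1) / np.mean(abs(c1) ** 2, axis=1))

def ipd_rad(c0, c1):
    """Interaural phase per bin: phase of the cross-correlation E{c0 c1*}."""
    return np.angle(np.mean(c0 * np.conj(c1), axis=1))

def cue_errors(in0, in1, out0, out1, freqs, f_cut=1500.0):
    """Mean ILD error (dB) over all bins and mean phase error (rad) below f_cut."""
    ild_err = np.mean(np.abs(ild_db(out0, out1) - ild_db(in0, in1)))
    dphi = ipd_rad(out0, out1) - ipd_rad(in0, in1)
    dphi = np.angle(np.exp(1j * dphi))            # wrap to (-pi, pi]
    itd_err = np.mean(np.abs(dphi[freqs <= f_cut]))
    return float(ild_err), float(itd_err)

rng = np.random.default_rng(4)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
freqs = np.linspace(0, 8000, 9)                   # hypothetical bin centres in Hz
in0, in1 = cn(9, 200), cn(9, 200)                 # hypothetical input component

# Example: an algorithm that doubles the amplitude at the second ear only
ild_err, itd_err = cue_errors(in0, in1, in0, 2 * in1, freqs)
gain = weighted_snr_improvement([0, 0], [10, 20], importance=[1, 3])
print(ild_err, itd_err, gain)
```

The same `cue_errors` call is applied once to the speech component and once to the noise component of the input and output signals.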
Physical evaluation: MWFv
[Figure: S0N60 (auditec, 60°, μ = 5, L = 96, N = 4). Six panels over the range 0 to 1: SNR left [dB], SNR right [dB], ILD speech [dB], ILD noise [dB], ITD speech [rad], ITD noise [rad]]
Perceptual evaluation
• Procedure:
o Headphone experiments, using measured HRTFs
o Filters are calculated off-line on VU speech-weighted noise as S and multitalker babble noise as N
o All stimuli presented at comfort level, 5 NH subjects (ongoing)
[Figure: block diagram with speech through HRTFx and noise through HRTFv, microphone signals (Mic), binaural filter, gain G, and left (L) / right (R) headphone outputs]
• Speech intelligibility:
o Adaptive procedure to find 50% Speech Reception Threshold (SRT)
• Localisation:
o S and N components (telephone) are sent separately through filter
o Localise S and N in room where HRTFs were measured
o Level roving 6 dB, 3 repetitions per condition for each subject
Perceptual evaluation: MWFv
• Algorithms: unprocessed, state-of-the-art bilateral, MWF, MWFv (η = 0.2)
• Conditions: S0N60, S45N315 and S90N270 (T60 = 510 ms)
[Figure: four panels of localisation responses (-90° to 90°) for the sources N270, N315, S0, S45, N60 and S90 at 0 dB, labelled unproc, classic, mwf0 and mwf02]
Perceptual evaluation: MWFv
• With state-of-the-art systems: preservation of binaural cues only within central angle of frontal hemisphere.
• Binaural MWF:
o Preserves localisation cues for speech source
o Preserves localisation cues for noise source(s) with a small amount of mixing
o Recent SRT experiments (N = 2) show no substantial SRT difference between η = 0 and η = 0.2
• Ongoing research:
o Perceptual evaluation (SRT and localisation) for MWF-ITF
Bandwidth constraints
• Binaural MWF:
o 2M microphone signals are transmitted over wireless link
• Reduce bandwidth requirement of wireless link:
o Transmit one signal from contralateral ear
– Front contralateral microphone signal (MWF-front)
– Output of contralateral fixed (e.g. superdirective) beamformer
– Output of MWF using only M contralateral microphone signals (MWF-contra)
– Iterative distributed binaural MWF scheme (MWF-iter, dB-MWF)
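The cost of transmitting fewer signals can be illustrated by designing the same SDW-MWF on different channel subsets and comparing output SNRs. The sketch below covers only the simplest options (all microphones, local microphones plus the contralateral front microphone, local microphones only); it does not implement the beamformer-based or iterative distributed schemes. The channel layout, transfer functions and noise model are hypothetical.

```python
import numpy as np

def mwf_output_snr_db(Rx, Rv, channels, ref, mu=1.0):
    """Output SNR of an SDW-MWF that only sees `channels` (ref must be included)."""
    idx = np.asarray(channels)
    Rx_s = Rx[np.ix_(idx, idx)]
    Rv_s = Rv[np.ix_(idx, idx)]
    e = (idx == ref).astype(float)                 # selects the reference channel
    W = np.linalg.solve(Rx_s + mu * Rv_s, Rx_s @ e)
    speech = np.real(W.conj() @ Rx_s @ W)
    noise = np.real(W.conj() @ Rv_s @ W)
    return float(10 * np.log10(speech / noise))

rng = np.random.default_rng(6)
cn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)

M = 4                                   # assumed layout: 0, 1 = left aid; 2, 3 = right aid
A, B = cn(M), cn(M)                     # hypothetical speech / noise transfer functions
Rx = np.outer(A, A.conj())
Rv = np.outer(B, B.conj()) + 0.01 * np.eye(M)

snr_full = mwf_output_snr_db(Rx, Rv, [0, 1, 2, 3], ref=0, mu=5.0)   # 2 signals sent
snr_front = mwf_output_snr_db(Rx, Rv, [0, 1, 2], ref=0, mu=5.0)     # 1 signal sent
snr_local = mwf_output_snr_db(Rx, Rv, [0, 1], ref=0, mu=5.0)        # no link

print(f"full {snr_full:.1f} dB, front {snr_front:.1f} dB, local {snr_local:.1f} dB")
```

For a single speech source the output SNR cannot decrease when channels are added, so the full binaural filter is an upper bound for any reduced-bandwidth scheme built from the same microphones.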
Physical evaluation
[Figure: performance comparison of MWF-based binaural algorithms. AI weighted SNR improvement (dB) versus noise source(s) angle (°) for MWF-full, MWF-front, MWF-contra and MWF-iter]
• Performance of dB-MWF close to full binaural MWF!
Contralateral directivity pattern
[Figure: fullband polar directivity patterns for the left hearing aid, T60 = 140 ms, S0N120, μ = 5]
• B-MWF (contralateral, N = 4): SNR = 14.6 dB
• MWF-front (front contralateral, N = 3): SNR = 10.5 dB
• MWF-contra (MWF contralateral, N = 4): SNR = 14.2 dB
• dB-MWF: SNR = 14.2 dB
Conclusions
• State-of-the-art signal processing in (bilateral) HAs: preservation of binaural cues only within central angle of frontal hemisphere
• Binaural MWF:
o Substantial noise reduction (MWF 4 3 > 2)
o Preservation of binaural speech cues
o Distortion of binaural noise cues
o No assumptions about positions of sources and microphones, but requires VAD
• Compromise between noise reduction and binaural cue preservation can be achieved with extensions of MWF
o Mixing with microphone signals
o Interaural Transfer Function
• Reduction of bandwidth using distributed MWF