
Noise reduction and binaural cue preservation of multi-microphone algorithms

Simon Doclo, Tim van den Bogaert, Marc Moonen, Jan Wouters

Dept. of Electrical Engineering (ESAT-SCD), KU Leuven, Belgium

Dept. of Neurosciences (ExpORL), KU Leuven, Belgium

Oldenburg, June 28 2007

Overview

• Problem statement

o Improve speech intelligibility and preserve spatial awareness

o Bilateral vs. binaural processing

• Binaural signal processing using multi-channel Wiener filter

o MWF: noise reduction and preservation of speech cues, noise cues are distorted

o Extension of MWF to preserve binaural cues of all components:

– MWFv: partial estimation of noise component

– MWF-ITF: extension with Interaural Transfer Function

o Physical and perceptual evaluation

• Reduce bandwidth requirements of wireless link

o Distributed binaural MWF

Problem statement

• Many hearing-impaired listeners are fitted with a hearing aid at both ears

o Signal processing to selectively enhance the useful speech signal and improve speech intelligibility

o Signal processing to preserve directional hearing and spatial awareness

o Multiple microphones available: spectral + spatial processing

• Binaural auditory cues:

o Interaural Time Difference (ITD) and Interaural Level Difference (ILD)

o Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localisation

o ITD: f < 1500 Hz, ILD: f > 2000 Hz

[Figure: illustrations of IPD/ITD and ILD]
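To make the two cues concrete, here is a minimal sketch of how a broadband ILD and ITD could be measured from a pair of ear signals. The function names, the sampling rate, and the toy signal are assumptions for illustration and are not taken from the slides; the ITD is taken from the peak of the cross-correlation and the ILD from the left/right power ratio.

```python
import numpy as np

def ild_db(left, right):
    """Interaural level difference: left/right power ratio in dB."""
    return 10.0 * np.log10(np.mean(left ** 2) / np.mean(right ** 2))

def itd_seconds(left, right, fs):
    """Interaural time difference from the cross-correlation peak.

    A positive value means the left-ear signal lags the right-ear signal.
    """
    xcorr = np.correlate(left, right, mode="full")
    lag = int(np.argmax(xcorr)) - (len(right) - 1)
    return lag / fs

# Toy example (all values hypothetical): white noise reaching the left ear
# 8 samples later and at half the amplitude of the right ear.
fs = 16000
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
delay = 8
right = source
left = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])

print(f"ILD = {ild_db(left, right):.2f} dB")
print(f"ITD = {itd_seconds(left, right, fs) * 1e3:.2f} ms")
```

A noise reduction algorithm that preserves binaural cues should leave both quantities (per source) unchanged between its input and its output.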

Problem statement -bilateral/binaural

Binaural processing

Bandwidth reduction

Conclusions

Bilateral vs. Binaural

Bilateral system

Independent left/right processing:

→ preservation of binaural cues for localisation?

Binaural system

More microphones:

→ better performance?

→ preservation of binaural cues?

Need for a binaural link

Bilateral vs. Binaural

• Bilateral system:

o Independent processing of left and right hearing aid

o Localisation cues are distorted [Van den Bogaert et al., 2006]

→ this also affects intelligibility through the binaural hearing advantage

[Figure: RMS error per loudspeaker when accumulating all responses of the different test conditions (NH = normal hearing, NO = hearing impaired without hearing aids, O = omnidirectional configuration, A = adaptive directional configuration)]

• Binaural system:

o Cooperation between left and right hearing aid (e.g. wireless link)

o Assumption: all microphone signals are available at the same time

Objectives/requirements for binaural algorithm:

1. SNR improvement: noise reduction, limit speech distortion

2. Preservation of binaural cues (speech/noise) to exploit binaural hearing advantage

3. No assumption about position of speech source and microphones

[Van den Bogaert et al., 2006]


Configuration and signals

• Configuration: microphone array with M microphones at left and right hearing aid, communication between hearing aids

o Microphone signals at the left hearing aid (index 0; the right hearing aid, index 1, is defined in the same way):

$$Y_{0,m}(\omega) = X_{0,m}(\omega) + V_{0,m}(\omega), \quad m = 0 \ldots M_0 - 1$$

with speech component $X_{0,m}(\omega)$ and noise component $V_{0,m}(\omega)$

o Stacked vector of all microphone signals: $\mathbf{Y}(\omega) = \mathbf{X}(\omega) + \mathbf{V}(\omega)$

• Use all microphone signals to compute output signal at both ears:

$$Z_0(\omega) = \mathbf{W}_0^H(\omega)\,\mathbf{Y}(\omega), \quad Z_1(\omega) = \mathbf{W}_1^H(\omega)\,\mathbf{Y}(\omega)$$

Overview of cost functions

• Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears

• Adding a trade-off between noise reduction and speech distortion: speech-distortion weighted multi-channel Wiener filter (SDW-MWF) [Doclo 2002, Spriet 2004]

• Adding binaural cue preservation of speech + noise:

o Partial estimation of noise component (MWFv) [Klasen 2005]

o Extension with ITD-ILD or Interaural Transfer Function (ITF) [Doclo 2005, Klasen 2006, Van den Bogaert 2007]

Binaural multi-channel Wiener filter

• Binaural SDW-MWF: estimate of speech component in microphone signal at both ears (usually front microphone) + trade-off between noise reduction and speech distortion

$$J_{SDW}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\}, \quad \mathbf{W} = \begin{bmatrix} \mathbf{W}_0 \\ \mathbf{W}_1 \end{bmatrix}$$

Here $X_{0,r_0}$ and $X_{1,r_1}$ are the speech components in the front microphones; the first term measures speech distortion, the second (weighted by $\mu$) noise reduction.

$$\mathbf{W}_{SDW} = \mathbf{R}^{-1}\mathbf{r}, \quad \mathbf{R} = \begin{bmatrix} \mathbf{R}_x + \mu \mathbf{R}_v & \mathbf{0}_M \\ \mathbf{0}_M & \mathbf{R}_x + \mu \mathbf{R}_v \end{bmatrix}, \quad \mathbf{r} = \begin{bmatrix} \mathbf{r}_{x,0} \\ \mathbf{r}_{x,1} \end{bmatrix}, \quad \mathbf{R}_x = \mathbf{R}_y - \mathbf{R}_v$$

o Depends on second-order statistics of speech and noise

o Estimate Ry during speech-dominated time-frequency segments, estimate Rv during noise-dominated segments, requiring robust voice activity detection (VAD) mechanism

o No assumptions about positions of microphones and sources

o Adaptive (LMS-based) algorithm available [Spriet 2004, Doclo 2007]
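The batch computation described above can be sketched for a single frequency bin as follows. This is only an illustration under stated assumptions: the function names, the toy transfer-function vectors `a` and `b`, the noise levels, and the choice of reference microphones are hypothetical, and a perfect VAD is emulated by drawing separate noise-only frames. Each ear's filter solves $(\mathbf{R}_x + \mu \mathbf{R}_v)\mathbf{W}_k = \mathbf{R}_x \mathbf{e}_{r_k}$, where $\mathbf{R}_x \mathbf{e}_{r_k} = E\{\mathbf{X} X_{k,r_k}^*\}$.

```python
import numpy as np

def sdw_mwf(Ry, Rv, r0, r1, mu=1.0):
    """Binaural SDW-MWF for one frequency bin.

    Solves (Rx + mu*Rv) W_k = Rx e_rk for the left (k=0) and right (k=1)
    reference microphones, with Rx = Ry - Rv.
    """
    Rx = Ry - Rv
    A = Rx + mu * Rv
    return np.linalg.solve(A, Rx[:, r0]), np.linalg.solve(A, Rx[:, r1])

def snr_db(W, Rx, Rv):
    """Output SNR (dB) of filter W for given speech/noise correlation matrices."""
    return 10 * np.log10(np.real(W.conj() @ Rx @ W) / np.real(W.conj() @ Rv @ W))

rng = np.random.default_rng(1)
N, frames = 4, 5000          # 2 microphones per hearing aid (assumed)

def cgauss(*shape):
    """Unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

a = np.ones(N, dtype=complex)                 # hypothetical speech transfer functions
b = np.exp(1j * 0.5 * np.pi * np.arange(N))   # hypothetical interferer transfer functions

def noise(n):
    """Directional interferer plus weak uncorrelated sensor noise."""
    return np.outer(b, cgauss(n)) + 0.1 * cgauss(N, n)

Y = np.outer(a, cgauss(frames)) + noise(frames)   # speech-dominated frames
V = noise(frames)                                 # noise-only frames (perfect VAD)
Ry = Y @ Y.conj().T / frames
Rv = V @ V.conj().T / frames

W0, W1 = sdw_mwf(Ry, Rv, r0=0, r1=2, mu=5.0)

# Evaluate with the true statistics of the toy scenario.
Rx_true = np.outer(a, a.conj())
Rv_true = np.outer(b, b.conj()) + 0.01 * np.eye(N)
snr_in = snr_db(np.eye(N)[0], Rx_true, Rv_true)
snr_out = snr_db(W0, Rx_true, Rv_true)
print(f"left ear: input SNR {snr_in:.1f} dB, output SNR {snr_out:.1f} dB")
```

Nothing in the sketch uses the microphone or source positions; only the two correlation matrices enter, which is the point made in the last bullets.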

Binaural multi-channel Wiener filter

• Interpretation for single speech source:

o Spectral and spatial filtering operation

$$\mathbf{W}_{SDW,0} = \frac{\boldsymbol{\Gamma}_v^{-1}\mathbf{A}}{\mathbf{A}^H \boldsymbol{\Gamma}_v^{-1} \mathbf{A}} \cdot \frac{\mathbf{A}^H \boldsymbol{\Gamma}_v^{-1} \mathbf{A}}{\mu P_v / P_s + \mathbf{A}^H \boldsymbol{\Gamma}_v^{-1} \mathbf{A}} \cdot A_{0,r_0}^*$$

with $\boldsymbol{\Gamma}_v$ the (spatial) noise coherence matrix, $P_s$ and $P_v$ the (spectral) speech and noise powers, and $\mathbf{A}$ the vector of acoustic transfer functions of the speech source. The term $\mathbf{A}^H \boldsymbol{\Gamma}_v^{-1} \mathbf{A}$ reflects the spatial separation between speech and noise sources, and $P_s / P_v$ the SNR.

o Equivalent to superdirective beamformer (diffuse noise field) or delay-and-sum beamformer (spatially white noise field) + single-channel WF-based postfilter (spectral subtraction)

• Binaural cues (ITD-ILD):

o Binaural cues of the speech component are perfectly preserved

o Binaural cues of the noise component → binaural cues of the speech component!
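The single-source interpretation can be checked numerically. The sketch below assumes exact statistics $\mathbf{R}_x = P_s \mathbf{A}\mathbf{A}^H$ and $\mathbf{R}_v = P_v \boldsymbol{\Gamma}_v$ with randomly drawn (hypothetical) values; variable names and reference-microphone indices are illustrative. It compares the general solution with the beamformer-plus-postfilter factorisation and then measures the interaural transfer function (ITF) of the output noise.

```python
import numpy as np

rng = np.random.default_rng(2)
N, mu, Ps, Pv = 4, 5.0, 1.0, 0.5
r0, r1 = 0, 2   # hypothetical front (reference) microphones, left and right

a = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # speech transfer functions
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Gamma = G @ G.conj().T + N * np.eye(N)    # Hermitian positive definite "coherence" matrix
Rx, Rv = Ps * np.outer(a, a.conj()), Pv * Gamma

# General SDW-MWF solution for both ears.
A = Rx + mu * Rv
W0 = np.linalg.solve(A, Rx[:, r0])
W1 = np.linalg.solve(A, Rx[:, r1])

# Factorised form: spatial filter times single-channel postfilter.
Gi_a = np.linalg.solve(Gamma, a)
rho = np.real(a.conj() @ Gi_a)            # A^H Gamma^{-1} A
W0_closed = (Gi_a / rho) * (rho / (mu * Pv / Ps + rho)) * np.conj(a[r0])

def itf(W0, W1, R):
    """ITF of a component with correlation matrix R: E{Z0 Z1*} / E{Z1 Z1*}."""
    return (W0.conj() @ R @ W1) / (W1.conj() @ R @ W1)

itf_speech_in = a[r0] / a[r1]
itf_noise_in = Rv[r0, r1] / Rv[r1, r1]
print("speech ITF in/out:", itf_speech_in, itf(W0, W1, Rx))
print("noise  ITF in/out:", itf_noise_in, itf(W0, W1, Rv))
```

Because both filters are scalar multiples of the same spatial filter, the output noise inherits the ITF of the speech source, which is the cue distortion noted in the last bullet.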

Partial noise estimation (MWFv)

• Partial estimation of noise component

o Estimate of sum of speech component and scaled noise component

$$J_{MWFv}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} + \eta V_{0,r_0} - \mathbf{W}_0^H \mathbf{Y} \\ X_{1,r_1} + \eta V_{1,r_1} - \mathbf{W}_1^H \mathbf{Y} \end{bmatrix} \right\|^2 \right\}$$

$$J_{SDW\text{-}MWFv}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \eta V_{0,r_0} - \mathbf{W}_0^H \mathbf{V} \\ \eta V_{1,r_1} - \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\}$$

o Relationship with SDW-MWF: mix with reference microphone signals

$$Z_0 = \eta\, Y_{0,r_0} + (1 - \eta)\, Z_{SDW,0}, \quad Z_1 = \eta\, Y_{1,r_1} + (1 - \eta)\, Z_{SDW,1}$$

o Reduces the noise reduction performance

o Works for multiple noise sources
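The mixing relationship can be illustrated with a short sketch. It assumes the cost $E|X_r - \mathbf{W}^H\mathbf{X}|^2 + \mu E|\eta V_r - \mathbf{W}^H\mathbf{V}|^2$ per ear with uncorrelated speech and noise, exact (randomly drawn, hypothetical) correlation matrices for a single speech source, and illustrative function names. It checks that minimising this cost directly gives the same filter as mixing the reference microphone with the SDW-MWF output, and prints the resulting trade-off.

```python
import numpy as np

def mwfv(Rx, Rv, r, mu, eta):
    """Minimiser of E|X_r - W^H X|^2 + mu E|eta V_r - W^H V|^2 for one ear."""
    return np.linalg.solve(Rx + mu * Rv, Rx[:, r] + mu * eta * Rv[:, r])

def itf(W0, W1, R):
    """ITF of a component with correlation matrix R: E{Z0 Z1*} / E{Z1 Z1*}."""
    return (W0.conj() @ R @ W1) / (W1.conj() @ R @ W1)

def snr_db(W, Rx, Rv):
    return 10 * np.log10(np.real(W.conj() @ Rx @ W) / np.real(W.conj() @ Rv @ W))

rng = np.random.default_rng(3)
N, mu, r0, r1 = 4, 5.0, 0, 2       # hypothetical sizes and reference microphones
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Rx = np.outer(a, a.conj())          # single speech source
Rv = G @ G.conj().T + N * np.eye(N) # full-rank noise correlation matrix

itf_noise_in = Rv[r0, r1] / Rv[r1, r1]
e0 = np.eye(N)[r0]
W0_sdw = mwfv(Rx, Rv, r0, mu, eta=0.0)   # eta = 0 is the plain SDW-MWF

results = {}
for eta in (0.0, 0.2, 1.0):
    W0 = mwfv(Rx, Rv, r0, mu, eta)
    W1 = mwfv(Rx, Rv, r1, mu, eta)
    # Same filter obtained by mixing reference microphone and SDW-MWF filter.
    mix_ok = np.allclose(W0, eta * e0 + (1 - eta) * W0_sdw)
    results[eta] = (snr_db(W0, Rx, Rv), abs(itf(W0, W1, Rv) - itf_noise_in), mix_ok)
    print(f"eta={eta:.1f}: left SNR {results[eta][0]:6.2f} dB, "
          f"noise ITF error {results[eta][1]:.3f}")
```

With η = 1 the output is just the reference microphone signal (no noise reduction, cues intact); with η = 0 it is the SDW-MWF (maximum noise reduction, noise cues distorted).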

Interaural Wiener filter (MWF-ITF)

• Extension of SDW-MWF with binaural cues

o Add term related to binaural cues of noise (and speech) component

$$J_{tot}(\mathbf{W}) = J_{SDW}(\mathbf{W}) + \alpha J_{cue}^{x}(\mathbf{W}) + \beta J_{cue}^{v}(\mathbf{W})$$

o Possible cues: ITD, ILD, Interaural Transfer Function (ITF)

$$ITF_{out}^{v} = \frac{Z_{v0}}{Z_{v1}} = \frac{\mathbf{W}_0^H \mathbf{V}}{\mathbf{W}_1^H \mathbf{V}}, \quad ITF_{in}^{v} = \frac{V_{0,r_0}}{V_{1,r_1}}, \ \text{e.g.} \ ITF_{in}^{v} = \frac{E\{V_{0,r_0} V_{1,r_1}^*\}}{E\{V_{1,r_1} V_{1,r_1}^*\}} = \frac{\mathbf{R}_v(r_0, r_1)}{\mathbf{R}_v(r_1, r_1)}$$

$$J_{tot}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\} + \alpha E\left\{ \left| \mathbf{W}_0^H \mathbf{X} - ITF_{in}^{x}\, \mathbf{W}_1^H \mathbf{X} \right|^2 \right\} + \beta E\left\{ \left| \mathbf{W}_0^H \mathbf{V} - ITF_{in}^{v}\, \mathbf{W}_1^H \mathbf{V} \right|^2 \right\}$$

The term weighted by $\alpha$ enforces ITF preservation of the speech component, the term weighted by $\beta$ ITF preservation of the noise component.

o Closed form expression!

o Large β changes direction of speech → increase weight α

o Implicit assumption of single noise source
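Since every term of the total cost is quadratic in the stacked filter $[\mathbf{W}_0; \mathbf{W}_1]$, the minimiser follows from one linear system. The sketch below is one way to set this up; the weight names `alpha` (speech ITF term) and `beta` (noise ITF term), the function names, and the random test matrices are assumptions for illustration, and the input ITFs are taken from the correlation matrices as $\mathbf{R}(r_0, r_1)/\mathbf{R}(r_1, r_1)$.

```python
import numpy as np

def cue_block(R, c):
    """Quadratic form of E|W0^H U - c W1^H U|^2 in the stacked filter [W0; W1]."""
    return np.block([[R, -np.conj(c) * R], [-c * R, abs(c) ** 2 * R]])

def mwf_itf(Rx, Rv, r0, r1, mu=1.0, alpha=0.0, beta=0.0):
    """Closed-form minimiser of the SDW-MWF cost plus weighted ITF terms."""
    N = Rx.shape[0]
    itf_x = Rx[r0, r1] / Rx[r1, r1]
    itf_v = Rv[r0, r1] / Rv[r1, r1]
    A = np.kron(np.eye(2), Rx + mu * Rv)
    A = A + alpha * cue_block(Rx, itf_x) + beta * cue_block(Rv, itf_v)
    W = np.linalg.solve(A, np.concatenate([Rx[:, r0], Rx[:, r1]]))
    return W[:N], W[N:]

def itf(W0, W1, R):
    """Output ITF of a component with correlation matrix R."""
    return (W0.conj() @ R @ W1) / (W1.conj() @ R @ W1)

rng = np.random.default_rng(4)
N, mu, r0, r1 = 4, 5.0, 0, 2        # hypothetical sizes and reference microphones
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Rx = np.outer(a, a.conj())           # single speech source
Rv = G @ G.conj().T + N * np.eye(N)  # full-rank noise correlation matrix
itf_x_in, itf_v_in = Rx[r0, r1] / Rx[r1, r1], Rv[r0, r1] / Rv[r1, r1]

errors = {}
for beta in (0.0, 1e3):
    W0, W1 = mwf_itf(Rx, Rv, r0, r1, mu=mu, alpha=0.0, beta=beta)
    errors[beta] = (abs(itf(W0, W1, Rx) - itf_x_in), abs(itf(W0, W1, Rv) - itf_v_in))
    print(f"beta={beta:g}: speech ITF error {errors[beta][0]:.3g}, "
          f"noise ITF error {errors[beta][1]:.3g}")
```

In this toy case a large noise weight pulls the output noise ITF toward its input value but shifts the speech ITF away from its own, which is why the speech weight then has to be increased as well.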

Simulation setup

• Identification of HRTFs:

o Binaural recordings on CORTEX MK2 artificial head

o 2 omni-directional microphones on each hearing aid (d = 1 cm)

o LS = -90:15:90, 90:30:270, 1 m from head

o Room reverberation: T60 = 140 ms (and T60 = 510 ms)

Experimental results

• Simulations:

o SxNy, SNR = 0 dB on left front microphone (broadband)

o fs = 20.48 kHz

• MWF algorithmic parameters:

o Batch procedure, perfect VAD

o L = 96, μ = 5

o MWFv for different η, MWF-ITF for different α, β

• Physical evaluation:

o Speech = HINT, noise = babble noise

o Speech intelligibility: SNR

o Localisation: ITD / ILD

• Perceptual evaluation:

o Preliminary study with NH subjects

o Speech intelligibility: SRT

o Localisation: localise S and N

Physical evaluation

• Performance measures:

o Intelligibility weighted SNR improvement (left/right)

$$\Delta SNR_L = \sum_i I_i \, \Delta SNR_{L,i}$$

with $I_i$ the importance of the i-th frequency for speech intelligibility

o ILD error (speech/noise component) → power ratio

$$\Delta ILD_x = \sum_i \left| ILD_{out}^{x}(\omega_i) - ILD_{in}^{x}(\omega_i) \right|$$

o ITD error (speech/noise component) → phase of cross-correlation

$$\Delta ITD_x = \sum_i I_i \, \Delta ITD_x(\omega_i), \quad \Delta ITD_x(\omega_i) = \left| \angle E\{X_{0,r_0} X_{1,r_1}^*\} - \angle E\{Z_{x0} Z_{x1}^*\} \right|$$

with the weights $I_i$ here acting as a low-pass filter (1500 Hz)
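The three measures can be sketched as small helper functions. This is one possible reading under stated assumptions: the band-importance weights, the per-bin input values, and the function names are made up for illustration (real band-importance values would come from a speech intelligibility standard), and the ILD error is simply summed over bins without normalisation.

```python
import numpy as np

def weighted_snr_improvement(snr_in_db, snr_out_db, importance):
    """Intelligibility-weighted SNR improvement: sum_i I_i (SNR_out,i - SNR_in,i)."""
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()
    return float(np.sum(w * (np.asarray(snr_out_db) - np.asarray(snr_in_db))))

def ild_error(p0_in, p1_in, p0_out, p1_out):
    """Sum over bins of |ILD_out - ILD_in|, with ILD the left/right power ratio in dB."""
    ild_in = 10 * np.log10(np.asarray(p0_in) / np.asarray(p1_in))
    ild_out = 10 * np.log10(np.asarray(p0_out) / np.asarray(p1_out))
    return float(np.sum(np.abs(ild_out - ild_in)))

def itd_error(cross_in, cross_out, weights):
    """Weighted phase difference (rad) between input and output cross-correlations."""
    dphi = np.abs(np.angle(np.asarray(cross_out) * np.conj(np.asarray(cross_in))))
    return float(np.sum(np.asarray(weights) * dphi))

# Hypothetical three-band example.
importance = [0.5, 0.25, 0.25]
d_snr = weighted_snr_improvement([0, 0, 0], [6, 12, 3], importance)

d_ild = ild_error(p0_in=[4.0, 1.0], p1_in=[1.0, 1.0],
                  p0_out=[2.0, 1.0], p1_out=[1.0, 1.0])

# Weights play the role of the low-pass filter: only bins below 1500 Hz count.
d_itd = itd_error(cross_in=np.exp(1j * np.array([0.3, 1.0])),
                  cross_out=np.exp(1j * np.array([0.5, 0.9])),
                  weights=[0.5, 0.5])

print(d_snr, d_ild, d_itd)
```

Multiplying the output cross-correlation by the conjugate of the input one before taking the angle keeps the phase difference wrapped to (-π, π].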

Physical evaluation: MWFv

[Figure: results for S0N60 (auditec noise at 60°, μ = 5, L = 96, N = 4), plotted against a parameter running from 0 to 1. Panels: SNR left [dB], SNR right [dB], ILD speech [dB], ILD noise [dB], ITD speech [rad], ITD noise [rad]]

Perceptual evaluation

• Procedure:

o Headphone experiments, using measured HRTFs

o Filters are calculated off-line on VU speech-weighted noise as S and multitalker babble noise as N

o All stimuli presented at comfort level, 5 NH subjects (ongoing)

[Figure: block diagram of the test setup: speech and noise are filtered with HRTFx and HRTFv to obtain the microphone signals (L, R), which pass through the binaural filter and a gain G to the headphones]

• Speech intelligibility:

o Adaptive procedure to find 50% Speech Reception Threshold (SRT)

• Localisation:

o S and N components (telephone) are sent separately through filter

o Localise S and N in room where HRTFs were measured

o Level roving 6 dB, 3 repetitions per condition for each subject

Perceptual evaluation: MWFv

• Algorithms: unprocessed, state-of-the-art bilateral, MWF, MWFv (η = 0.2)

• Conditions: S0N60, S45N315 and S90N270 (T60 = 510 ms)

[Figure: localisation responses at 0 dB for four panels (unproc, classic, mwf0, mwf02). x-axis: stimulus N270, N315, S0, S45, N60, S90; y-axis: response angle from -90° to 90° in steps of 15°]

Perceptual evaluation: MWFv

• With state-of-the-art systems: preservation of binaural cues only within central angle of frontal hemisphere.

• Binaural MWF:

o Preserves localisation cues for speech source

o Preserves localisation cues for noise source(s) with a small mixing parameter η

o Recent SRT experiments (N=2) show no substantial SRT difference between η = 0 and η = 0.2

• Ongoing research:

o Perceptual evaluation (SRT and localisation) for MWF-ITF

Bandwidth constraints

• Binaural MWF:

o 2M microphone signals are transmitted over wireless link

• Reduce bandwidth requirement of wireless link:

o Transmit one signal from contralateral ear

– Front contralateral microphone signal

– Output of contralateral fixed (e.g. superdirective) beamformer

– Output of MWF using only M contralateral microphone signals

– Iterative distributed binaural MWF scheme
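As a rough illustration of the first option only, the sketch below compares the left-ear output SNR of an SDW-MWF that uses (a) just the left microphones, (b) the left microphones plus the front contralateral microphone, and (c) all microphones. The microphone indexing, the random single-source statistics, and the function name are assumptions; the beamformer-based and iterative schemes are not modelled here.

```python
import numpy as np

def mwf_snr_db(Rx, Rv, mics, ref, mu=1.0):
    """Output SNR (dB) of an SDW-MWF restricted to the microphones in `mics`."""
    idx = np.ix_(mics, mics)
    Rx_s, Rv_s = Rx[idx], Rv[idx]
    W = np.linalg.solve(Rx_s + mu * Rv_s, Rx_s[:, mics.index(ref)])
    return 10 * np.log10(np.real(W.conj() @ Rx_s @ W) / np.real(W.conj() @ Rv_s @ W))

rng = np.random.default_rng(5)
M = 2                         # microphones per hearing aid (assumed)
N = 2 * M                     # indices 0,1 = left (0 = front), 2,3 = right (2 = front)
a = rng.standard_normal(N) + 1j * rng.standard_normal(N)
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Rx = np.outer(a, a.conj())             # single speech source
Rv = G @ G.conj().T + N * np.eye(N)    # full-rank noise correlation matrix

snr_bilateral = mwf_snr_db(Rx, Rv, mics=[0, 1], ref=0, mu=5.0)
snr_front = mwf_snr_db(Rx, Rv, mics=[0, 1, 2], ref=0, mu=5.0)
snr_full = mwf_snr_db(Rx, Rv, mics=[0, 1, 2, 3], ref=0, mu=5.0)

print(f"left only            (0 signals received): {snr_bilateral:.2f} dB")
print(f"+ front contralateral (1 signal received): {snr_front:.2f} dB")
print(f"all microphones      ({M} signals received): {snr_full:.2f} dB")
```

For a single speech source with exact statistics, adding a microphone can never lower the achievable output SNR, so the three values are ordered; how much is lost by transmitting a single signal depends on the scenario.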

Physical evaluation

[Figure: Performance comparison of MWF-based binaural algorithms. x-axis: noise source(s) angle (°); y-axis: AI weighted SNR improvement (dB), 8 to 22 dB; curves: MWF-full, MWF-front, MWF-contra, MWF-iter]

• Performance of dB-MWF close to full binaural MWF!

Contralateral directivity pattern

[Figure: fullband directivity patterns (polar plots, dB scale) at the left hearing aid for T60 = 140 ms, S0N120, μ = 5. Panels: B-MWF (N=4, SNR = 14.6 dB); MWF-front (N=3, SNR = 10.5 dB); MWF-contra (N=4, SNR = 14.2 dB); dB-MWF (SNR = 14.2 dB)]

Conclusions

• State-of-the-art signal processing in (bilateral) HAs: preservation of binaural cues only within central angle of frontal hemisphere

• Binaural MWF:

o Substantial noise reduction (MWF 4  3 > 2)

o Preservation of binaural speech cues

o Distortion of binaural noise cues

o No assumptions about positions of sources and microphones, but requires a VAD

• Compromise between noise reduction and binaural cue preservation can be achieved with extensions of MWF

o Mixing with microphone signals

o Interaural Transfer Function

• Reduction of bandwidth using distributed MWF
