
Theoretical analysis of binaural cue preservation using Multi-Channel Wiener Filtering and Interaural Transfer Functions

S. Doclo, T.J. Klasen, T. Van den Bogaert, J. Wouters, M. Moonen

Katholieke Universiteit Leuven, Belgium - Dept. Electrical Engineering (SCD), ExpORL

1 Binaural hearing aids

• Hearing impairment → reduction of speech intelligibility in background noise

– signal processing to selectively enhance/extract useful speech signal

– multiple microphones available: spectral + spatial processing

– many hearing impaired are fitted with a hearing aid at both ears

• Binaural auditory cues:

– Interaural Time Difference (ITD)

– Interaural Level Difference (ILD)

– binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localization

• Bilateral system: independent processing → binaural cues are not preserved

• Binaural system: cooperation between left and right hearing aid

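[Figure: binaural configuration for a hearing aid user. The microphone signals $Y_{0,0}(\omega) \ldots Y_{0,M_0-1}(\omega)$ and $Y_{1,0}(\omega) \ldots Y_{1,M_1-1}(\omega)$ are filtered by $\mathbf{W}_0(\omega)$ and $\mathbf{W}_1(\omega)$ to give the output signals $Z_0(\omega)$ and $Z_1(\omega)$.]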

1. SNR improvement: noise reduction, limit speech distortion

2. Preservation of binaural cues to exploit binaural hearing advantage

3. No assumptions about position of speech source and microphones

2 Binaural multi-channel Wiener filter

Multi-channel Wiener filter (MWF): MMSE estimate of speech component in microphone signal at both ears

Extensions of the MWF:

– trade-off between noise reduction and speech distortion: speech-distortion-weighted multi-channel Wiener filter (SDW-MWF) [Doclo 2002, Spriet 2004]

– binaural cue preservation of speech + noise: partial estimation of noise component [Klasen 2005]

– binaural cue preservation of speech + noise: extension with ITD-ILD or Interaural Transfer Function (ITF) [Doclo 2005, Klasen 2006]

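• Configuration: microphone array at left and right hearing aid

$$Y_{0,m}(\omega) = X_{0,m}(\omega) + V_{0,m}(\omega), \quad m = 0 \ldots M_0 - 1$$

$$\mathbf{Y}(\omega) = \left[ Y_{0,0}(\omega) \; \ldots \; Y_{0,M_0-1}(\omega) \;\; Y_{1,0}(\omega) \; \ldots \; Y_{1,M_1-1}(\omega) \right]^T = \mathbf{X}(\omega) + \mathbf{V}(\omega)$$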

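• Cooperation between hearing aids: use all available microphone signals to generate output signal at both ears → computation of filters $\mathbf{W}_0(\omega)$ and $\mathbf{W}_1(\omega)$

$$Z_0(\omega) = \mathbf{W}_0^H(\omega)\mathbf{Y}(\omega), \quad Z_1(\omega) = \mathbf{W}_1^H(\omega)\mathbf{Y}(\omega), \quad \mathbf{W}(\omega) = \begin{bmatrix} \mathbf{W}_0(\omega) \\ \mathbf{W}_1(\omega) \end{bmatrix}$$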

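• SDW-MWF: estimate speech component in microphone signal at both ears; additional trade-off between noise reduction and speech distortion

$$J_{SDW}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\} \;\Rightarrow\; \mathbf{W}_{SDW} = \mathbf{R}^{-1}\mathbf{r}$$

$$\mathbf{R} = \begin{bmatrix} \mathbf{R}_x + \mu\mathbf{R}_v & \mathbf{0}_M \\ \mathbf{0}_M & \mathbf{R}_x + \mu\mathbf{R}_v \end{bmatrix}, \quad \mathbf{r} = \begin{bmatrix} \mathbf{r}_{x0} \\ \mathbf{r}_{x1} \end{bmatrix}, \quad \mathbf{R}_x = \mathbf{R}_y - \mathbf{R}_v$$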

– estimate $\mathbf{R}_y$ during speech-dominated segments and $\mathbf{R}_v$ during noise-dominated segments → robust VAD required

– no assumptions about positions of microphones and sources
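The SDW-MWF solution above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `sdw_mwf`, its argument names and the synthetic matrices are hypothetical, and the correlation matrices are taken as given instead of being estimated with a VAD. Because $\mathbf{R}$ is block-diagonal, each ear's filter can be obtained from its own $M \times M$ system, using the fact that $\mathbf{r}_{x0} = E\{\mathbf{X} X_{0,r_0}^*\}$ is the $r_0$-th column of $\mathbf{R}_x$.

```python
import numpy as np

def sdw_mwf(R_y, R_v, ref_left, ref_right, mu=1.0):
    """Return (W0, W1) minimising J_SDW for given correlation matrices.

    R_y, R_v  : (M, M) Hermitian correlation matrices of the stacked
                microphone signals (speech+noise, and noise only).
    ref_left, ref_right : indices of the reference microphones r0 and r1
                inside the stacked vector Y (hypothetical argument names).
    """
    R_x = R_y - R_v                    # speech correlation matrix
    block = R_x + mu * R_v             # one diagonal block of R
    # r_x0 and r_x1 are the reference-microphone columns of R_x.
    W0 = np.linalg.solve(block, R_x[:, ref_left])
    W1 = np.linalg.solve(block, R_x[:, ref_right])
    return W0, W1

# Synthetic example (illustrative values only): M = 4 stacked microphones.
rng = np.random.default_rng(0)
M = 4
a = rng.standard_normal(M) + 1j * rng.standard_normal(M)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_v = B @ B.conj().T + 0.1 * np.eye(M)     # positive definite noise matrix
R_y = np.outer(a, a.conj()) + R_v          # speech + noise
W0, W1 = sdw_mwf(R_y, R_v, ref_left=0, ref_right=2, mu=1.0)
```

Solving two $M \times M$ systems instead of inverting the full $2M \times 2M$ matrix gives the same result and avoids forming the zero blocks.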

3 Theoretical analysis

• Performance measures:

– SNR improvement (left/right): difference between input and output SNR

– ITD error (speech/noise): phase of cross-correlation

– ILD error (speech/noise): power ratio

• Single speech source, no assumptions about noise field:

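– $\mathbf{X} = \mathbf{A}S$ with $\mathbf{A}$ acoustic transfer function vector (head, microphones, room)

$$\mathbf{W}_{SDW,0} = \frac{\mathbf{R}_v^{-1}\mathbf{A}}{\mathbf{A}^H\mathbf{R}_v^{-1}\mathbf{A} + \frac{\mu}{P_s}}\, A_{0,r_0}^*, \qquad \mathbf{W}_{SDW,1} = \frac{\mathbf{R}_v^{-1}\mathbf{A}}{\mathbf{A}^H\mathbf{R}_v^{-1}\mathbf{A} + \frac{\mu}{P_s}}\, A_{1,r_1}^*$$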

- ITD/ILD of speech component is perfectly preserved

- ITD/ILD of output noise component = ITD/ILD of speech component, so the binaural cues of the noise are not preserved!
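The two cue statements above can be checked numerically. The sketch below is an illustration only: the transfer function vector, the noise correlation matrix, and the index layout (`r0 = 0`, `r1 = 2` in a stacked vector of four microphones) are arbitrary assumptions, not the poster's simulation setup. It builds the single-source filters and compares the interaural transfer function (ITF) of the speech and noise components before and after filtering.

```python
import numpy as np

rng = np.random.default_rng(1)
M, r0, r1 = 4, 0, 2        # stacked vector: mics 0-1 on one side, 2-3 on the other
mu, P_s = 1.0, 1.0         # trade-off parameter and speech power (illustrative)

# Illustrative scenario: random transfer function vector A, random noise field R_v.
A = rng.standard_normal(M) + 1j * rng.standard_normal(M)
B = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R_v = B @ B.conj().T + 0.1 * np.eye(M)
R_x = P_s * np.outer(A, A.conj())          # X = A S  =>  R_x = P_s A A^H

def sdw_mwf_single_source(ref):
    """Single-source SDW-MWF filter for reference microphone index `ref`."""
    Rv_inv_A = np.linalg.solve(R_v, A)             # R_v^{-1} A
    rho = np.real(np.vdot(A, Rv_inv_A))            # A^H R_v^{-1} A
    return Rv_inv_A / (rho + mu / P_s) * np.conj(A[ref])

def output_itf(W0, W1, R):
    """ITF of an output component with correlation R: E{Z0 Z1*} / E{|Z1|^2}."""
    return np.vdot(W0, R @ W1) / np.vdot(W1, R @ W1)

W0, W1 = sdw_mwf_single_source(r0), sdw_mwf_single_source(r1)

itf_speech_in = A[r0] / A[r1]
itf_noise_in = R_v[r0, r1] / R_v[r1, r1]
itf_speech_out = output_itf(W0, W1, R_x)
itf_noise_out = output_itf(W0, W1, R_v)

print("speech ITF in/out:", itf_speech_in, itf_speech_out)
print("noise  ITF in/out:", itf_noise_in, itf_noise_out)
```

Both filters are the same vector scaled by $A_{0,r_0}^*$ and $A_{1,r_1}^*$ respectively, so any component passed through them leaves with the ratio $A_{0,r_0}/A_{1,r_1}$. That is why the output noise ITF matches the speech ITF and not the input noise ITF.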

4 Extension with Interaural Transfer Function

• Control binaural cues of noise (and speech) component

• Interaural Transfer Function (ITF): incorporates both ITD and ILD

– assumption: single localized noise source (constant ITF)

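$$\mathrm{ITF}_{des}^v = \frac{V_{0,r_0}}{V_{1,r_1}} = \frac{E\{V_{0,r_0}V_{1,r_1}^*\}}{E\{V_{1,r_1}V_{1,r_1}^*\}}, \qquad \mathrm{ITF}_{out}^v(\mathbf{W}) = \frac{\mathbf{W}_0^H\mathbf{V}}{\mathbf{W}_1^H\mathbf{V}}$$

$$J_{ITF}^v(\mathbf{W}) = E\left\{ \left| \frac{\mathbf{W}_0^H\mathbf{V}}{\mathbf{W}_1^H\mathbf{V}} - \mathrm{ITF}_{des}^v \right|^2 \right\} = \frac{E\{|\mathbf{W}_0^H\mathbf{V} - \mathrm{ITF}_{des}^v\,\mathbf{W}_1^H\mathbf{V}|^2\}}{E\{|\mathbf{W}_1^H\mathbf{V}|^2\}} = \frac{\mathbf{W}^H\mathbf{R}_{vt}\mathbf{W}}{\mathbf{W}^H\mathbf{R}_{v1}\mathbf{W}}$$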

• Total cost function: noise reduction, speech distortion, cue preservation

$$J_{tot}(\mathbf{W}) = J_{SDW}(\mathbf{W}) + \alpha J_{ITF}^x(\mathbf{W}) + \beta J_{ITF}^v(\mathbf{W})$$

– subtle difference with quadratic ITF cost function in [Klasen, ICASSP 2006]

– no closed-form expression → iterative optimization techniques
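The ITF cost for the noise component can be evaluated as a ratio of two quadratic forms in the stacked filter. The sketch below is a minimal illustration: the function names are hypothetical, and the block matrices are an assumption obtained by expanding $E\{|\mathbf{W}_0^H\mathbf{V} - \mathrm{ITF}_{des}^v\,\mathbf{W}_1^H\mathbf{V}|^2\}$ and $E\{|\mathbf{W}_1^H\mathbf{V}|^2\}$ with $\mathbf{W} = [\mathbf{W}_0;\ \mathbf{W}_1]$, since the poster does not spell them out. The example noise source is made up.

```python
import numpy as np

def desired_itf(R_v, r0, r1):
    """ITF_des^v = E{V_r0 V_r1*} / E{V_r1 V_r1*}, read off the noise correlation matrix."""
    return R_v[r0, r1] / R_v[r1, r1]

def itf_cost_noise(W0, W1, R_v, itf_des):
    """J_ITF^v(W) = (W^H R_vt W) / (W^H R_v1 W) for the stacked W = [W0; W1]."""
    W = np.concatenate([W0, W1])
    Z = np.zeros_like(R_v)
    # Assumed block structure, from expanding the numerator and denominator.
    R_vt = np.block([[R_v, -np.conj(itf_des) * R_v],
                     [-itf_des * R_v, abs(itf_des) ** 2 * R_v]])
    R_v1 = np.block([[Z, Z], [Z, R_v]])
    num = np.real(W.conj() @ R_vt @ W)
    den = np.real(W.conj() @ R_v1 @ W)
    return num / den

# Hypothetical single localized noise source with transfer vector g (rank-one R_v).
g = np.array([1.0, np.exp(-0.3j), 0.6 * np.exp(0.8j), 0.6 * np.exp(0.5j)])
R_v = np.outer(g, g.conj())
r0, r1 = 0, 2
itf_des = desired_itf(R_v, r0, r1)

e = np.eye(4)
cost_passthrough = itf_cost_noise(e[r0], e[r1], R_v, itf_des)  # reference mics unchanged
cost_swapped = itf_cost_noise(e[r1], e[r0], R_v, itf_des)      # left/right swapped
print(cost_passthrough, cost_swapped)
```

Passing the reference microphones through unchanged keeps the noise ITF and gives a cost of zero up to rounding, while swapping the two ears gives a clearly positive cost. A function of this form, added to $J_{SDW}$ with weight $\beta$, is the kind of objective an iterative optimizer would minimise.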

5 Simulation results

• Investigate effect of α and β on noise reduction and cue preservation

• Data model:

– one speech source + one noise source, non-reverberant environment

– head shadow effect → HRTFs (equal for microphones on same hearing aid)

– sensor noise: $\mathbf{R}_v(\omega) = P_v(\omega)\left[ \mathbf{g}(\omega,\theta_v)\,\mathbf{g}^H(\omega,\theta_v) + \delta\,\mathbf{I}_M \right]$

• Simulation parameters:

– speech source at −5° and noise source at 40°

– 2-microphone array ($d_0$ = 2 cm, $d_1$ = 1.5 cm)

– f = 2 kHz, $f_s$ = 16 kHz, SNR = 0 dB, δ = 0.01 (sensor noise −20 dB), µ = 1

• Conclusions:

– Increasing β substantially decreases ITD/ILD error of noise component, but also decreases SNR improvement

– α can be used for reducing ITD/ILD error of speech component caused by increasing β
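The noise correlation matrix of the data model, one localized source plus spatially white sensor noise, can be sketched as follows. The function name `noise_correlation` and the vector `g` are hypothetical: `g` is a unit-modulus placeholder, not the HRTF-based vector $\mathbf{g}(\omega, \theta_v)$ used in the simulations.

```python
import numpy as np

def noise_correlation(g, P_v=1.0, delta=0.01):
    """R_v = P_v * (g g^H + delta * I_M): localized noise source plus sensor noise."""
    g = np.asarray(g, dtype=complex)
    M = g.size
    return P_v * (np.outer(g, g.conj()) + delta * np.eye(M))

# Placeholder transfer vector for M = 4 stacked microphones (illustrative phases only).
g = np.exp(1j * np.array([0.0, -0.4, 0.9, 0.6]))
R_v = noise_correlation(g, P_v=1.0, delta=0.01)

# With unit-modulus entries in g, delta is the sensor-to-localized noise power ratio.
sensor_noise_db = 10 * np.log10(0.01)
print(sensor_noise_db)
```

The $\delta\,\mathbf{I}_M$ term keeps $\mathbf{R}_v$ invertible, which the SDW-MWF expressions need, and δ = 0.01 corresponds to the −20 dB sensor noise level quoted in the simulation parameters.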

[Figures: ITD error of the speech component [%], ITD error of the noise component [%], and average ∆SNR [dB], each plotted as a function of α and β; polar plot over angles 0° to 330°.]
