(1)

Improved Signal Processing in Hearing Aids:

A System Approach

Kim Ngo

ESAT-SCD, Katholieke Universiteit Leuven, Belgium

EST-SIGNAL Meeting September 2009.

(2)

Outline

◮ Introduction (hearing aids, hearing loss, acoustic feedback, background noise).
◮ Problem statement and motivation.
◮ Multi-Channel Wiener Filter (MWF) based Noise Reduction (NR).
◮ SDW-MWF based approach to integrate NR and Dynamic Range Compression (DRC).
◮ Adaptive Feedback Cancellation (AFC).
◮ Conclusion.
◮ Publications.
◮ Timeline.

(3)

Introduction

Research areas in Hearing Aids

[Figure: research areas in hearing aids — binaural processing, wireless communications, AD/DA converters, digital signal processing, analog signal processing, automatic sound classification, single-channel noise reduction, directional microphones, loudspeakers, speech/audio coding, source localisation, beamforming, source separation, active noise control, filterbank design, dereverberation, automatic speech recognition, feedback cancellation, dynamic range compression, multi-channel noise reduction]

(4)

Introduction

Sensorineural hearing loss

[Figure: audiogram — hearing level (dB) versus frequency (Hz, 125–8000), with regions for normal hearing and mild, moderate, severe and profound hearing loss]

Dynamic Range

[Figure: dynamic range — soft, loud and uncomfortable levels and hearing thresholds for normal and sensorineural hearing]

◮ Hearing threshold increases with increasing frequency.
◮ The threshold of hearing is raised as a result of the hearing loss.
◮ The threshold of loudness discomfort remains the same.
◮ Reduced dynamic range (from threshold to discomfort).

(5)

Introduction

Dynamic Range Compression

◮ Audibility is an important first step in improving the intelligibility of a speech signal.

[Figure: DRC characteristic — output SPL (dB) versus input SPL (dB), with compression ratio/threshold (CR/CT) and attack and release times; a critical-band gain model adjusts the gain applied to the input spectrum to produce the output spectrum]

◮ Automatically adjust the gain based on the intensity level.
◮ High-intensity sounds are attenuated, low-intensity sounds are amplified.
◮ Compression Threshold (CT): the point at which the slope changes.
◮ Compression Ratio (CR): the steepness of the slope.
◮ Amplification gain G in dB.

Objective:

◮ Map the wide dynamic range of speech into the reduced dynamic range.
◮ Keep weak sounds audible and loud sounds not uncomfortably loud (see the sketch below).
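The mapping above can be made concrete with a minimal sketch of a static compression characteristic; the CT, CR and gain values are illustrative placeholders rather than a fitted prescription, and attack/release smoothing is omitted.

```python
import numpy as np

def static_drc_gain_db(level_db, ct_db=30.0, cr=3.0, gain_db=30.0):
    """Static compression characteristic: the full gain is applied below the
    compression threshold (CT); above CT the excess level is scaled by 1/CR,
    so the applied gain shrinks as the input level rises."""
    over = np.maximum(np.asarray(level_db, dtype=float) - ct_db, 0.0)
    return gain_db - over * (1.0 - 1.0 / cr)

# Soft inputs keep the full gain, loud inputs are progressively compressed.
for level in (20, 40, 60, 80):
    print(level, "dB SPL ->", round(float(static_drc_gain_db(level)), 1), "dB gain")
```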

(6)

Introduction

Acoustic Feedback

[Figure: acoustic feedback set-up — the near-end signal and the feedback signal add at the microphone; the forward path G produces the loudspeaker signal, and the acoustic feedback path F couples the loudspeaker back to the microphone]

◮ Undesired acoustic coupling between loudspeaker and microphone.
◮ Limits the maximum amplification.
◮ Feedback is most severe at high frequencies.
◮ Instability results in a high-frequency tone.
◮ Correlation between the near-end signal and the loudspeaker signal.
◮ Standard adaptive filtering converges to a biased solution.

Objective:

◮ Increase the maximum stable gain (MSG).
◮ Reduce bias and misadjustment (improve convergence).
◮ Minimize speech distortion (sound quality).

(7)

Introduction

Background Noise

◮ Reduced frequency resolution (separating sounds of different frequencies).
◮ Reduced temporal resolution (intense sounds mask weaker sounds).

Hearing aid users

Understanding speech in noise is a major problem

◮ Multiple speakers, fans, traffic, etc.
◮ Reduces the intelligibility of speech.
◮ More sensitive to the noise level.
◮ Need a higher SNR to communicate.

Objective:

◮ Maximally reduce the noise (SNR improvement).
◮ Minimize speech distortion (sound quality).
◮ Improve the intelligibility of speech.

(8)

Problem Statement and Motivation

◮ Compensation of sensorineural hearing loss requires NR, DRC and AFC.
◮ The general problem of NR, DRC and AFC is not new.

◮ Each of these areas is usually developed and evaluated independently.

Existing method

◮ Hearing aids typically use a serial concatenation of NR, DRC and AFC.
◮ The algorithms can counteract and limit the functionality of one another.

Short-term objective:

◮ Development of multi-channel NR (SDW-MWF).
◮ Integration of SDW-MWF and DRC.
◮ Development of adaptive feedback cancellation (PEM-based AFC).
◮ Analyse any undesired effects in the integration process.

Long-term objective:

◮ Integration of NR, DRC and AFC into one signal processing scheme.
◮ SNR improvement vs. audibility vs. speech distortion vs. increased MSG.

(9)

Speech Distortion Weighted Multi-channel Wiener Filter (SDW-MWF)

[Figure: multi-channel filter-and-sum structure — microphone signals X_1(k,l), X_2(k,l), ..., X_M(k,l), each containing the desired signal and noise, are filtered by W_1(k,l), ..., W_M(k,l) and summed to form the output Z(k,l)]

Frequency-domain microphone signals,

$$X(k,l) = X^s(k,l) + X^n(k,l) \qquad (1)$$

MWF MMSE criterion,

$$W(k,l) = \arg\min_{W}\; \varepsilon\{|X_1^s(k,l) - W^H X(k,l)|^2\} \qquad (2)$$

SDW-MWF MMSE criterion,

$$W(k,l) = \arg\min_{W}\; \varepsilon\{|X_1^s(k,l) - W^H X^s(k,l)|^2\} + \mu\,\varepsilon\{|W^H X^n(k,l)|^2\} \qquad (3)$$

Optimal SDW-MWF,

$$W(k,l) = \big(R_s(k,l) + \mu R_n(k,l)\big)^{-1} R_s(k,l)\, e_1 \qquad (4)$$

Output of the SDW-MWF can be written as

$$Z(k,l) = W^{*,H}(k,l)\,X(k,l) \qquad (5)$$
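As a concrete illustration of Eqs. (4)-(5), the sketch below computes the SDW-MWF for a single frequency bin from given speech and noise correlation matrices; the matrices, µ and the microphone signals are toy values, not estimates from recorded data.

```python
import numpy as np

def sdw_mwf(R_s, R_n, mu=1.0):
    """SDW-MWF for one frequency bin (Eq. 4): W = (R_s + mu*R_n)^{-1} R_s e_1,
    where e_1 selects the first (reference) microphone."""
    M = R_s.shape[0]
    e1 = np.zeros(M)
    e1[0] = 1.0
    return np.linalg.solve(R_s + mu * R_n, R_s @ e1)

# Toy 2-microphone example (illustrative correlation matrices):
R_s = np.array([[1.0, 0.8], [0.8, 1.0]])    # speech correlation matrix R_s(k,l)
R_n = np.array([[0.5, 0.1], [0.1, 0.5]])    # noise correlation matrix R_n(k,l)
W = sdw_mwf(R_s, R_n, mu=2.0)
X = np.array([0.9 + 0.1j, 0.7 - 0.2j])      # stacked microphone signals X(k,l)
Z = np.vdot(W, X)                           # output Z = W^H X (Eq. 5)
print(W, Z)
```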

(10)

Concept of SDW-MWF_µ

◮ Second-order statistics of the noise are assumed to be stationary:

$$R_s(k,l) = R_x(k,l) - R_n(k,l) \qquad (6)$$

◮ For the estimation of R_x(k,l) and R_n(k,l), an averaging time window of 2-3 seconds is typically used.

[Figure: speech-plus-noise waveform versus time (s), marking the periods in which the speech+noise correlation matrix and the noise-only correlation matrix are updated]

Properties of SDW-MWF

◮ SDW-MWF depends on long-term averages of spectral and spatial characteristics.
◮ Eliminates short-time effects, such as musical noise.
◮ The SDW parameter µ is a fixed value for all frequencies.

Properties not included in SDW-MWF

◮ Speech and noise can be non-stationary, spectrally and temporally.
◮ Speech contains many pauses, while noise can be continuously present.
◮ A different weight could be given to speech-dominant and to noise-dominant segments.
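A minimal sketch of how the correlation matrices behind Eq. (6) can be tracked for one frequency bin; the recursive averaging, the forgetting factor and the external VAD flag are assumptions of this sketch, standing in for the 2-3 s averaging window and the update periods indicated in the figure above.

```python
import numpy as np

def update_correlations(R_x, R_n, X, speech_active, lam=0.995):
    """Recursive averaging of the speech+noise and noise-only correlation
    matrices for one frequency bin; speech_active is the VAD decision for
    the current frame (lam is an illustrative forgetting factor)."""
    outer = np.outer(X, X.conj())
    if speech_active:
        R_x = lam * R_x + (1 - lam) * outer   # update in speech+noise periods
    else:
        R_n = lam * R_n + (1 - lam) * outer   # update in noise-only periods
    R_s = R_x - R_n                           # Eq. (6)
    return R_x, R_n, R_s
```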

(11)

Speech Presence Probability

Two-state speech model

$$H_0(k,l):\; X_i(k,l) = X_i^n(k,l)$$
$$H_1(k,l):\; X_i(k,l) = X_i^s(k,l) + X_i^n(k,l) \qquad (7)$$

Conditional Probability Density Functions of the observed signals

$$p\big(X_i(k,l)\,|\,H_0(k,l)\big) = \frac{1}{\pi\,\lambda_i^n(k,l)}\, \exp\!\left(-\frac{|X_i(k,l)|^2}{\lambda_i^n(k,l)}\right)$$
$$p\big(X_i(k,l)\,|\,H_1(k,l)\big) = \frac{1}{\pi\,\big(\lambda_i^s(k,l) + \lambda_i^n(k,l)\big)}\, \exp\!\left(-\frac{|X_i(k,l)|^2}{\lambda_i^s(k,l) + \lambda_i^n(k,l)}\right) \qquad (8)$$

Speech Presence Probability

$$p(k,l) = \left\{ 1 + \frac{q(k,l)}{1 - q(k,l)}\,\big(1 + \xi(k,l)\big)\, \exp\!\big(-\upsilon(k,l)\big) \right\}^{-1} \qquad (9)$$

◮ Conditional SPP is estimated for each frequency bin and each frame
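A minimal sketch of Eq. (9) for one frequency bin; the a priori speech absence probability q, the a priori SNR ξ and the a posteriori SNR γ are assumed to come from separate estimators, and υ = γξ/(1+ξ) is the usual definition, which the slide does not spell out.

```python
import numpy as np

def conditional_spp(xi, gamma, q=0.5):
    """Conditional speech presence probability per bin (Eq. 9).
    xi: a priori SNR, gamma: a posteriori SNR, q: a priori speech absence
    probability.  Assumes upsilon = gamma * xi / (1 + xi)."""
    xi = np.asarray(xi, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    upsilon = gamma * xi / (1.0 + xi)
    return 1.0 / (1.0 + (q / (1.0 - q)) * (1.0 + xi) * np.exp(-upsilon))

# A high-SNR bin gets p close to 1, a low-SNR bin gets p close to 0.
print(conditional_spp(xi=[10.0, 0.1], gamma=[12.0, 0.2]))
```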

(12)

Extension of SDW-MWF_µ into SDW-MWF_SPP

Incorporating the conditional speech presence probability in the SDW-MWF,

$$W = \arg\min_{W}\; p\,\varepsilon\{|X_1^s - W^H X|^2 \mid H_1\} + (1-p)\,\varepsilon\{|W^H X^n|^2\} \qquad (10)$$

The SDW-MWF incorporating the conditional SPP can then be written as

$$W_{\mathrm{SPP}} = \Big(R_s + \tfrac{1}{p}\, R_n\Big)^{-1} R_s\, e_1 \qquad (11)$$

◮ If p = 0, the SDW-MWF_SPP attenuates the noise by applying W* ← 0.
◮ If p = 1, the SDW-MWF_SPP solution corresponds to the MWF solution (µ = 1).
◮ If 0 < p < 1, there is a trade-off between noise reduction and speech distortion.

The combined solution can then be written as

$$W_{\mathrm{SPP}} = \left(R_s + \frac{1}{\alpha\,\frac{1}{\mu} + (1-\alpha)\,p}\, R_n\right)^{-1} R_s\, e_1 \qquad (12)$$
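The sketch below evaluates the combined weighting of Eq. (12) for one frequency bin; the correlation matrices are illustrative, and a small guard is added for the p = 0, α = 0 case, where the noise weighting becomes unbounded and the filter tends towards W = 0.

```python
import numpy as np

def sdw_mwf_spp(R_s, R_n, p, mu=1.0, alpha=0.5, eps=1e-8):
    """Combined SDW-MWF_SPP filter (Eq. 12) for one frequency bin.
    alpha = 0 reduces to Eq. (11), alpha = 1 to the SDW-MWF_mu."""
    M = R_s.shape[0]
    e1 = np.zeros(M)
    e1[0] = 1.0
    denom = alpha * (1.0 / mu) + (1.0 - alpha) * p
    weight = 1.0 / max(denom, eps)            # guard the p = 0, alpha = 0 case
    return np.linalg.solve(R_s + weight * R_n, R_s @ e1)

# Speech-dominant (p = 0.9) versus noise-dominant (p = 0.1) frame, 2 microphones:
R_s = np.array([[1.0, 0.8], [0.8, 1.0]])
R_n = np.array([[0.5, 0.1], [0.1, 0.5]])
print(sdw_mwf_spp(R_s, R_n, p=0.9, mu=2.0))
print(sdw_mwf_spp(R_s, R_n, p=0.1, mu=2.0))
```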

(13)

Extension of SDW-MWF_µ into SDW-MWF_SPP

◮ Example of SPP for a frame

[Figure: weighting factor 1/p versus conditional SPP, for µ = 1, 2, 3, 4]

◮ SDW weighting based on SPP

[Figure: weighting factor versus conditional SPP, for µ = 1, 2, 3, 4 with α = 0.5]

(14)

Extension of SDW-MWF_µ into SDW-MWF_SPP

Example of SPP for a frame

[Figure: weighting factor 1/p versus conditional SPP for µ = 1, 2, 3, 4; and the SDW weighting based on SPP for µ = 1, 2, 3, 4 with α = 0.5]

Example of SPP for a frame (modified)

[Figure: conditional SPP versus frequency (Hz), for ζ_min = 0.1 with ζ_max = 0.3162 and with ζ_max = 0.60]

◮ SDW weighting based on SPP (modified)

[Figure: conditional SPP versus frequency (Hz), for α = 0 and for α = 1, 0.25, 0.50, 0.75 with µ = 2]

(15)

Extension of SDW-MWF_µ into SDW-MWF_SPP

[Figure: block diagram of the SPP-SDW-MWF — the input signal is analysed with an FFT; in the frequency domain the correlation matrices, the a priori SAP q̂ = P(H0), the a priori SNR and the a posteriori SNR feed the conditional SPP p = P(H1|X), which drives the filter W_SPP; the output Z = W_SPP^{*,H} X is synthesised with an IFFT. The SPP estimation forms an additional signal processing path next to the existing signal processing path]

Challenges on the additional signal processing path

◮ Incorporating a psychoacoustical model.
◮ Masking properties of the human auditory system.
◮ Auditory properties of speech perception.
◮ Define perceptually relevant criteria.
◮ Make the residual noise perceptually inaudible.

Challenges on the existing signal processing path

◮ Limited opportunities.
◮ Continuously updating the correlation matrices in speech+noise periods.
◮ Development of Voice Activity Detection.

(16)

Experimental Set-up for SDW-MWF_µ and SDW-MWF_SPP

Simulations have been performed with:

◮ A 2-microphone behind-the-ear hearing aid mounted on a CORTEX MK2 manikin.
◮ The loudspeakers (FOSTEX 6301B) are positioned at 1 meter from the center of the head.
◮ The reverberation time is T_60 = 0.21 s.
◮ The speech source is located at 0° and the two multi-talker babble noise sources are located at 120° and 180°.
◮ The speech signals consist of male sentences from the HINT database and the noise signals consist of multi-talker babble from Auditec.
◮ The speech signals are sampled at 16 kHz.

(17)

Experimental results for SDW-MWF_µ and SDW-MWF_SPP

[Figure: ∆SNR_intellig (dB) and speech distortion SD (dB) versus input SNR (dB), for α = 0 and for µ = 1 with α = 1, 0.75, 0.50, 0.25]

(18)

Integration of MWF based NR and DRC

Motivation

◮ When NR and DRC are serially concatenated, undesired interaction effects occur.
◮ DRC can counteract NR by amplifying the residual noise after NR.
◮ This degrades the SNR and defeats the purpose of using NR.

[Figure: serial concatenation of NR and DRC — the SDW-MWF_µ filter W_µ(k,l) produces Z^s(k,l), which is passed to the speech DRC with gain G^s_dB to give Ẑ^s(k,l)]

◮ DRC does not distinguish between speech-dominant and noise-dominant segments.
◮ Low-intensity segments are amplified equally (including the residual noise).

$$Z(k,l) = W^{*,H}(k,l)\,X(k,l) \qquad (13)$$
$$Z(k,l) = \hat{X}^s \ldots$$

(19)

Extension of DRC into Dual-DRC

[Figure: output SPL (dB) versus input SPL (dB) for the speech DRC and the noise DRC curves]

◮ Reusing the conditional SPP estimated in the SDW-MWF_SPP.
◮ A dual-DRC concept can be introduced.
◮ Using a switchable compression characteristic.

Dual-DRC concept

◮ If p(k,l) = 1, the speech DRC is applied.
◮ If p(k,l) = 0, it is undesirable to amplify and the noise DRC is applied.
◮ For the in-between cases, a weighted sum of the two DRC curves is used (a sketch follows below the figure).

[Figure: combined SDW-MWF_SPP and dual-DRC — the conditional SPP estimate p(k,l) drives both the filter W_SPP(k,l) and the dual-DRC gains G^s_dB and G^n_dual,dB, mapping Z^s(k,l) to Ẑ^s(k,l)]
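A minimal sketch of the dual-DRC concept above: the per-bin gain is a p(k,l)-weighted combination of the speech-DRC and noise-DRC gains. Blending the gains in dB, as well as the CT, CR and gain values, are assumptions of this sketch.

```python
import numpy as np

def static_drc_gain_db(level_db, ct_db, cr, gain_db):
    """Static compression: full gain below CT, excess level scaled by 1/CR above it."""
    over = max(level_db - ct_db, 0.0)
    return gain_db - over * (1.0 - 1.0 / cr)

def dual_drc_gain_db(level_db, p, speech=(30.0, 3.0, 30.0), noise=(30.0, 3.0, 15.0)):
    """Dual-DRC: p = 1 -> speech DRC, p = 0 -> noise DRC,
    0 < p < 1 -> weighted sum of the two compression curves."""
    gs = static_drc_gain_db(level_db, *speech)   # speech compression characteristic
    gn = static_drc_gain_db(level_db, *noise)    # noise compression characteristic
    return p * gs + (1.0 - p) * gn

# Speech-dominant bins keep the full gain, noise-dominant bins get less amplification.
for p in (1.0, 0.5, 0.0):
    print(p, "->", dual_drc_gain_db(40.0, p), "dB")
```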

(20)

Experimental Set-up for SDW-MWF-based NR and DRC

Simulations have been performed with:

◮ A 2-microphone behind-the-ear hearing aid mounted on a CORTEX MK2 manikin.
◮ The loudspeakers (FOSTEX 6301B) are positioned at 1 meter from the center of the head.
◮ The reverberation time is T_60 = 0.21 s.
◮ The speech source is located at 0° and the two multi-talker babble noise sources are located at 120° and 180°.
◮ The speech signals consist of male sentences from the HINT database.
◮ The noise signals consist of multi-talker babble from Auditec.
◮ The signals are sampled at 16 kHz.
◮ An FFT length of 128 with half-overlapping frames is used.
◮ The DRC is implemented based on 20 critical bands.

The following parameters are fixed during all simulations:

◮ The input level is set to 65 dB SPL at the hearing aid microphones.
◮ The DRC attack and release times are set to at = 10 ms and rt = 150 ms.
◮ The compression threshold is set to CT = 30 dB.
◮ The hearing aid gain is set to G_dB = 30 dB.

(21)

Experimental results for SDW-MWF_µ-based NR and DRC

[Figure: ∆SNR_intellig (dB) and SD (dB) versus input SNR, for 1/p + DRC and for µ = 1, 2, 3 with DRC]

(22)

Experimental results for SDW-MWF_SPP-based NR and dual-DRC

[Figure: ∆SNR_intellig (dB) and SD (dB) versus input SNR, for 1/p + DRC and for ∆G_dB = 5, 10, 15]

(23)

Combined NR, DRC and AFC

Long-term objective:

Integration of NR, DRC and AFC into one signal processing scheme.

[Figure: block diagram of the combined scheme — the conditional SPP estimate p(k,l) drives both the SDW-MWF_SPP filter W_SPP(k,l) and the dual-DRC gains G^s_dB and G^n_dual,dB, mapping Z^s(k,l) to Ẑ^s(k,l)]

(24)

Adaptive Feedback Cancellation

[Figure: AFC set-up — the near-end signal v(t) and the feedback signal x(t) add to form the microphone signal y(t); the feedback cancellation path F̂ produces ŷ[t|f̂(t)], which is subtracted to give the feedback-compensated signal d[t, f̂(t)] feeding the forward path G, whose output u(t) drives the loudspeaker and the acoustic feedback path F]

The microphone signal

$$y(t) = v(t) + x(t) = v(t) + F(q,t)\,u(t) \qquad (15)$$

The feedback-compensated signal

$$d(t) = v(t) + \big[F(q,t) - \hat{F}(q,t)\big]\,u(t) \qquad (16)$$

◮ Adaptively model the feedback path.
◮ Estimate the feedback signal.
◮ Correlation between the near-end signal and the loudspeaker signal.
◮ Caused by the closed signal loop.

Main Challenge

◮ Reduce the correlation between the near-end signal and the loudspeaker signal.
◮ Prediction error method-based AFC (PEM-based AFC).

(25)

PEM-based AFC (single near-end signal model)

[Figure: PEM-based AFC — the loudspeaker signal u(t) and the microphone signal y(t) are prefiltered by the inverse near-end signal model Ĥ^{-1} (decorrelating prefilter); the prediction error ε[t, ĥ(t), f̂(t−1)] drives the adaptation of the feedback canceller F̂, whose output ŷ[t|f̂(t)] is subtracted from y(t) to give d[t, f̂(t)]]

Microphone signal

$$y(t) = F(q,t)\,u(t) + H(q,t)\,e(t)$$

Prefiltering of loudspeaker and microphone signals.

◮ Inverse near-end signal model.

The all-pole model can be written as

$$H(q,t) = \frac{1}{C(q,t)} = \frac{1}{1 + c_1(t)\,q^{-1} + \ldots + c_{n_c}(t)\,q^{-n_c}} \qquad (17)$$

Prediction error

$$\varepsilon[t, \xi(t)] = H^{-1}(q,t)\,\big[\,y(t) - F(q,t)\,u(t)\,\big] \qquad (18)$$

Minimize the prediction error

$$\min_{\xi(t)} \; \frac{1}{2N}\sum_{k=1}^{t} \varepsilon^2[k, \xi(t)] \qquad (19)$$

◮ Single all-pole model (Short-term predictor) fails to remove the periodicity
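A rough sketch of the main ingredients of this scheme: an all-pole near-end model estimated by linear prediction (Eq. 17), the decorrelating prefilter Ĥ^{-1}(q) = C(q) used in Eq. (18), and an NLMS update on the prefiltered signals. The model order, step size, regularisation and framing are illustrative, not the exact settings of the presented system.

```python
import numpy as np

def lpc_coeffs(frame, order=20):
    """All-pole near-end model via linear prediction (autocorrelation method):
    returns c so that C(q) = 1 + c_1 q^-1 + ... + c_nc q^-nc (Eq. 17)."""
    r = np.correlate(frame, frame, "full")[len(frame) - 1:len(frame) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R + 1e-6 * np.eye(order), -r[1:])

def prefilter(x, c):
    """Apply the decorrelating prefilter H^-1(q) = C(q) to a signal."""
    return np.convolve(x, np.r_[1.0, c])[:len(x)]

def nlms_step(f_hat, u_tilde_buf, prefilt_error, mu=0.5, delta=1e-6):
    """One NLMS update of the feedback canceller on the prefiltered
    loudspeaker buffer and the prediction error (illustrative step size)."""
    norm = np.dot(u_tilde_buf, u_tilde_buf) + delta
    return f_hat + mu * prefilt_error * u_tilde_buf / norm

# Toy usage: the prefilter whitens a coloured near-end frame before adaptation.
rng = np.random.default_rng(0)
frame = np.convolve(rng.standard_normal(400), [1.0, 0.9, 0.5])[:400]
c = lpc_coeffs(frame, order=8)
print(np.var(frame), np.var(prefilter(frame, c)))
```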

(26)

PEM-based AFC (cascaded near-end signal model)

◮ A cascade of near-end signal models removes the coloring and the periodicity.

Sinusoidal model

$$d(t) = \sum_{n=1}^{P} A_n \cos(\omega_n t + \phi_n) + r(t), \qquad t = 1, \ldots, M \qquad (20)$$

Cascaded near-end signal model

$$y(t) = F(q,t)\,u(t) + H_1(q,t)\,H_2(q,t)\,e(t) \qquad (21)$$

The CPZLP model can be written as

$$d(t) = \left( \prod_{n=1}^{P} \frac{1 - 2\rho\cos\omega_n\, z^{-1} + \rho^2 z^{-2}}{1 - 2\cos\omega_n\, z^{-1} + z^{-2}} \right) e(t) \qquad (22)$$

The output of the prediction error filter is

$$e(t, \omega) = \left( \prod_{n=1}^{P} \frac{1 - 2\cos\omega_n\, z^{-1} + z^{-2}}{1 - 2\rho\cos\omega_n\, z^{-1} + \rho^2 z^{-2}} \right) d(t) \qquad (23)$$
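A small sketch of the prediction-error filter of Eq. (23), implemented as a cascade of second-order sections with zeros on the unit circle at the sinusoid frequencies and poles at radius ρ; ρ = 0.9 and the test signal are illustrative choices.

```python
import numpy as np
from scipy.signal import lfilter

def cpzlp_prediction_error(d, freqs_rad, rho=0.9):
    """Cascade of second-order sections implementing Eq. (23):
    zeros on the unit circle at each frequency, poles at radius rho."""
    e = np.asarray(d, dtype=float)
    for w in freqs_rad:
        b = [1.0, -2.0 * np.cos(w), 1.0]              # zeros at e^{+/- jw}
        a = [1.0, -2.0 * rho * np.cos(w), rho ** 2]   # poles at radius rho
        e = lfilter(b, a, e)
    return e

# Remove a 200 Hz tone from a noisy signal at fs = 16 kHz.
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
d = np.cos(2 * np.pi * 200 * t) + 0.1 * np.random.randn(len(t))
e = cpzlp_prediction_error(d, freqs_rad=[2 * np.pi * 200 / fs])
print(np.var(d), np.var(e))   # the periodic component is largely removed
```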

(27)

Incorporating pitch estimation in PEM-based AFC

◮ Speech signals are usually considered as voiced or unvoiced.
◮ Voiced sounds consist of a fundamental frequency ω_0 and its harmonic components.
◮ CPZLP estimates all frequencies independently.
◮ It does not exploit the harmonicity of speech.

Fundamental frequency estimation (pitch estimation)

◮ The sinusoids have frequencies that are multiples of ω_0, i.e., ω_n = n ω_0.
◮ This follows naturally from voiced speech being quasi-periodic.

Applying pitch estimation in PEM-based AFC

$$e(t, \omega) = \left( \prod_{n=1}^{P} \frac{1 - 2\cos\omega_n\, z^{-1} + z^{-2}}{1 - 2\rho\cos\omega_n\, z^{-1} + \rho^2 z^{-2}} \right) d(t), \qquad \omega_n = n\,\omega_0 \qquad (24)$$

The pitch estimation methods considered are
◮ Subspace-orthogonality-based pitch estimation.
◮ Subspace-shift-invariance-based pitch estimation.
◮ Optimal-filtering-based pitch estimation.
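A sketch of how a pitch estimate can parameterise the harmonic frequencies ω_n = n ω_0 used in Eq. (24). The crude autocorrelation pitch estimator below is only a stand-in for the subspace- and optimal-filtering-based estimators listed above.

```python
import numpy as np

def autocorr_pitch(frame, fs, f0_min=60.0, f0_max=400.0):
    """Crude autocorrelation pitch estimate (stand-in for the estimators above)."""
    frame = frame - np.mean(frame)
    r = np.correlate(frame, frame, "full")[len(frame) - 1:]
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    lag = lo + int(np.argmax(r[lo:hi]))
    return fs / lag

def harmonic_frequencies(f0, fs, P=15):
    """Harmonic model omega_n = n * omega_0 (rad/sample), kept below Nyquist."""
    w0 = 2 * np.pi * f0 / fs
    return [n * w0 for n in range(1, P + 1) if n * w0 < np.pi]

# These frequencies would parameterise the CPZLP prediction-error filter (Eq. 24).
fs = 16000
t = np.arange(0, 0.02, 1 / fs)
frame = sum(np.cos(2 * np.pi * 150 * n * t) / n for n in range(1, 4))
print(autocorr_pitch(frame, fs), len(harmonic_frequencies(150.0, fs)))
```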

(28)

Experimental Set-up for PEM-based AFC

Simulations have been performed with:

◮ The near-end sinusoidal model order is set to P = 15.
◮ The near-end noise model order is set to 30.
◮ Both near-end signal models are estimated using 50% overlapping data windows of length M = 320 samples.
◮ The NLMS adaptive filter length is set to n_F = 200.
◮ The near-end signal is a 30 s speech signal at f_s = 16 kHz.
◮ The forward path gain K(t) is set 3 dB below the maximum stable gain (MSG) without feedback cancellation.

(29)

Experimental results for PEM-based AFC

[Figure: left — maximum stable gain MSG (dB) versus time t (s), together with 20·log10 K(t) and the MSG of F(q); right — MAF (dB) versus t/Ts (samples); both for AFC-LP, AFC-CPZLP, AFC-shiftinv, AFC-orth and AFC-optfilt]

(30)

Conclusion

◮ SDW-MWF_µ can be further improved by using SDW-MWF_SPP.
◮ SDW-MWF_SPP can be further extended with more advanced SDW criteria.
◮ A combined MWF-based NR and Dynamic Range Compression scheme has been proposed.
◮ A dual-DRC concept using a switchable compression characteristic has been introduced.

(31)

Future and current work

Combined MWF and AFC

[Figure: two combined MWF and AFC structures — in the first, the microphone signals X_1(k,l), ..., X_M(k,l) are filtered by the NR filters W_1(k,l), ..., W_M(k,l) and a single feedback canceller W_F(k,l) operates on the loudspeaker signal u(k,l); in the second, per-microphone feedback cancellers W_F1(k,l), ..., W_FM(k,l) precede the NR filters; in both, the output Z(k,l) feeds the DRC and the loudspeaker signal couples back through the feedback paths F_1(q), ..., F_M(q)]

◮ Applying NR before AFC ...
◮ Applying AFC before NR ...

(32)

Future and current work

Combined Single-channel NR and AFC

[Figure: two combined single-channel NR and AFC structures, with near-end speech v(t), near-end noise n(t), acoustic feedback path F, feedback canceller F̂ and forward path G — one applying NR before the AFC and one applying the AFC before NR]

◮ Applying NR before AFC ...
◮ Applying AFC before NR ...

(33)

Future and current work

Further exploit PEM-based AFC using pitch estimation

◮ Using pitch estimation, the amplitudes can also be estimated.
◮ Possibility to design a more accurate CPZLP model (better decorrelation).

Remove the CPZLP model

◮ By re-using the optimal filter for both fundamental frequency estimation and filtering.
◮ Decorrelation by sinusoidal subtraction.
◮ Further investigate the use of subspace methods.

Combined Speech Coding and AFC

[Figure: two AFC structures with decorrelation in the forward path — a generic decorrelation device, or a speech coding stage, is inserted in the forward path G before the loudspeaker signal u(t)]
