
Academic year: 2021

Kasteelpark Arenberg 10, 3001 Leuven (Heverlee)

SUBBAND AND FREQUENCY-DOMAIN ADAPTIVE FILTERING TECHNIQUES FOR SPEECH ENHANCEMENT IN HANDS-FREE COMMUNICATION

Supervisor: Prof. dr. ir. M. Moonen

Dissertation submitted for the degree of Doctor in Applied Sciences by

All rights reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm, electronically or by any other means without prior written permission from the publisher.

The telecommunications sector is characterized by an increasing demand for user-friendliness and interactivity. This explains the growing interest in hands-free communication systems. Signal quality in current hands-free systems is unsatisfactory. To overcome this, advanced signal processing techniques such as the subband and frequency-domain adaptive filter are employed to enhance the signal. These techniques are known to have computationally efficient solutions. Furthermore, thanks to the frequency-dependent processing and adaptivity, highly time-varying systems and signals with a continuously changing spectral content such as speech can be handled.

This thesis deals with subband and frequency-domain adaptive filtering techniques for speech enhancement in hands-free communication. The text consists of four parts. In the first part design methods for perfect and nearly perfect reconstruction DFT modulated filter banks are discussed. Part II deals with subband and frequency-domain adaptive filtering. The subband adaptive filter and the PBFDAF algorithm are discussed. Next, the interrelation between both approaches is studied and a novel subband adaptation scheme is proposed. In part III of the thesis an extension to the PBFDAF algorithm is presented, called the PBFDRAP adaptive filter. The algorithm is analyzed and fast implementation schemes are derived. In the final part we describe applications of our algorithms to the acoustic echo cancellation problem. It is seen that the algorithms discussed in parts I-III can be

Mathematical Notation

$v$                               vector v
$v(z)$                            vector v, function of the z-transform variable
$M$                               matrix M
$M(z)$                            matrix M, function of the z-transform variable
v, M                              frequency-domain equivalents of v and M
$M^T$                             transpose of matrix M
$M^*$                             complex conjugate of matrix M
$M^H = (M^*)^T$                   Hermitian transpose of matrix M
$M^{-1}$                          inverse of matrix M
$M^\dagger$                       pseudo-inverse of matrix M
$\det M$                          determinant of matrix M
$\operatorname{adj} M = M^{-1} \cdot \det M$    adjugate of matrix M
$\operatorname{diag}\{v\}$        square diagonal matrix with vector v as diagonal
$M_*(z)$                          complex conjugation of the coefficients of M(z) without changing z
$\tilde{M}(z) = M_*^T(z^{-1})$    paraconjugate of M(z)
$v(m)$                            m-th element of vector v
$[v(z)]_m$                        m-th element of vector function v(z)
$M(m,n)$                          element on the m-th row and n-th column of matrix M
$[M(z)]_{m,n}$                    element on the m-th row and n-th column of matrix function M(z)
$A \otimes B$                     Kronecker product of matrix A and B
$h[k]$                            discrete-time filter or time sequence h
$H(z)$                            z-transform of h[k]
$H(f)$                            Discrete Fourier Transform of h[k]
$x \star y$                       convolution of x[k] and y[k]
$x \circledast y$                 circular convolution of x[k] and y[k]
x y                               circular correlation of x[k] and y[k]
$H_{l:L}(z)$                      the l-th out of L polyphase components of FIR
$h[k]_{N\downarrow}$              h[k] N-fold downsampled
$h[k]_{N\uparrow}$                h[k] N-fold upsampled
$\mathbb{N}$                      set of natural numbers
$\mathbb{N}_0 = \mathbb{N} \setminus \{0\}$    set of natural numbers larger than 0
$\mathbb{Z}$                      set of integer numbers
$\mathbb{Z}_0 = \mathbb{Z} \setminus \{0\}$    set of integer numbers except 0
$\mathbb{Q}$                      set of rational numbers
$\mathbb{R}$                      set of real numbers
$\mathbb{R}_0 = \mathbb{R} \setminus \{0\}$    set of real numbers except 0
$\mathbb{R}^+$                    set of positive real numbers
$\mathbb{C}$                      set of complex numbers
$\mathbb{R}^M$                    set of real M-dimensional vectors
$\mathbb{C}^M$                    set of complex M-dimensional vectors
$\mathbb{C}^M_0 = \mathbb{C}^M \setminus \{0\}$    set of complex M-dimensional vectors except 0
$\Re\{x\}$                        real part of $x \in \mathbb{C}$
$\Im\{x\}$                        imaginary part of $x \in \mathbb{C}$
$x^*$                             complex conjugate of x
$\operatorname{conj}(\cdot)$      complex conjugation
$\hat{x}$                         estimate of x
$\lfloor x \rfloor$               largest integer smaller than or equal to $x \in \mathbb{R}$
$\lceil x \rceil$                 smallest integer larger than or equal to $x \in \mathbb{R}$
$\operatorname{rnd}(x)$           round $x \in \mathbb{R}$ to the nearest integer
$|\cdot|$                         absolute value
$\|\cdot\|_2$                     2-norm
$E\{\cdot\}$                      expectation operator
$\sigma_x^2$                      variance of x
$\gcd(M,N)$                       greatest common divisor of M and N
$\operatorname{lcm}(M,N)$         least common multiple of M and N
$x \bmod y$                       remainder after division of $x \in \mathbb{N}$ by $y \in \mathbb{N}$
$p = a:b$                         p is an integer between $a \in \mathbb{Z}$ and $b \in \mathbb{Z}$, i.e. $a \le p \le b$, $p \in \mathbb{Z}$
$a \ll b$                         a is much smaller than b
$a \gg b$                         a is much larger than b
$a \approx b$                     a is approximately equal to b
Fixed Symbols

$M$                               number of subbands, DFT size
$N$                               subsampling factor
$L$                               block size
$P$                               filter partitioning length
$K$                               least common multiple
$f$                               frequency-domain variable
$\omega = 2\pi f$                 pulsation
$z$                               z-domain variable
$n$                               block time index
$f_s$                             sampling frequency
$w[k]$                            unknown FIR system, acoustic path
$\hat{w}[k]$, $\hat{w}^{(n)}[k]$  (equivalent) fullband adaptive filter, estimate of w[k]
$x$                               far-end (loudspeaker) signal
$s$                               local signal source-of-interest
$d = s + w \star x$               near-end (microphone) signal
$e$                               error signal, output of the adaptive filter
$(\cdot)_i$                       i-th subband error signal
$n_{rb}$                          number of real subbands to be processed
$n_{cb}$                          number of complex subbands to be processed
$\mu$                             adaptive filter step size
$R_{xx} = E\{x^* x^T\}$           autocorrelation matrix of vector x
$L_{FB}$                          length of the (equivalent) fullband adaptive filter
$L_{SB}$                          length of the subband adaptive filters
$L_f$                             length of the filter bank prototype
$L_f^a$                           length of the analysis filters
$L_f^s$                           length of the synthesis filters
$L_p$                             length of the synthesis polyphase filters
$L_{ef}$                          effective length of the analysis prototype filter
$L_{ac}$                          number of anti-causal filtering taps
$L_c$                             number of extra causal filtering taps
$0$                               zero vector or zero matrix
$0_N$                             $N \times N$ zero matrix
$0_{M \times N}$                  $M \times N$ zero matrix
$I_N$                             $N \times N$ identity matrix
$J$                               exchange matrix with ones along the main anti-diagonal and zeros elsewhere
$F$                               DFT matrix, $F(m,n) = e^{-j 2\pi mn / M}$, $0 \le m,n < M$
$H(z)$                            analysis polyphase matrix
$G(z)$                            synthesis polyphase matrix
$B(z)$                            prototype polyphase matrix of a DFT modulated analysis filter bank
$C(z)$                            prototype polyphase matrix of a DFT modulated synthesis filter bank
$h_0[k] \rightarrow H_0(z)$       analysis prototype filter
$g_0[k] \rightarrow G_0(z)$       synthesis prototype filter
$f_m[k] \rightarrow F_m(z)$       m-th subband adaptive filter
$j$                               $\sqrt{-1}$

Acronyms and Abbreviations

A/D        Analog-to-Digital converter
AEC        Acoustic Echo Cancellation
ALU        Arithmetic Logic Unit
ANC        Adaptive Noise Cancellation
APA        Affine Projection Algorithm
ASIC       Application-Specific Integrated Circuit
BLMS       Block-LMS adaptive filter
CD         Compact Disk
cf.        confer: compare with
CPU        Central Processing Unit
D/A        Digital-to-Analog converter
DCT        Discrete Cosine Transform
DFT        Discrete Fourier Transform
DRAM       Dynamic Random Access Memory
DSP        Digital Signal Processor
e.g.       exempli gratia: for example
Eq.        equation
ERLE       Echo Return Loss Enhancement
FDAF       Frequency-Domain Adaptive Filter
FFT        Fast Fourier Transform
FIR        Finite Impulse Response filter
HiFi       High Fidelity
IDFT       Inverse Discrete Fourier Transform
i.e.       id est: that is
iff        if and only if
IFFT       Inverse Fast Fourier Transform
IIR        Infinite Impulse Response filter
LMS        Least Mean Square adaptive filter
MAC        Multiply-Accumulate operation
MFlops     Millions of Floating point Operations Per Second
MIMO       Multi-Input Multi-Output system
MIPS       Millions of Instructions Per Second
NLMS       Normalized Least Mean Square adaptive filter
op.        number of equivalent real Operations
ops.       number of equivalent real Operations per Second
P/S        Parallel-to-Serial converter
PBFDAF     Partitioned Block Frequency-Domain Adaptive Filter
PBFDRAP    Partitioned Block Frequency-Domain RAP adaptive filter
QMF        Quadrature Mirror Filters
RAP        Row Action Projection
RLS        Recursive Least Squares adaptive filter
S/P        Serial-to-Parallel converter
SNR        Signal-to-Noise Ratio
SPL        Sound Pressure Level
SRAM       Static Random Access Memory
SVD        Singular Value Decomposition
VME        VERSA Module Eurocard (IEEE 1014) computer architecture
vs.        versus
w.r.t.     with respect to
@          at

Contents

Foreword
Abstract
Short Contents
Glossary
Contents
Summary

1 Introduction
  1.1 Problem statement
  1.2 Hands-free communication
    1.2.1 Definition
    1.2.2 Examples of hands-free communication systems
    1.2.3 Signal deterioration
  1.3 Characteristics of speech and the acoustic environment
    1.3.1 Speech signals
  1.4 Enhancement techniques
    1.4.1 Acoustic echo cancellation
    1.4.2 Noise suppression and interference cancellation
    1.4.3 Dereverberation
  1.5 Outline of the thesis and contributions
    1.5.1 Motivation
    1.5.2 Chapter by chapter overview and contributions
  1.6 Conclusions

2 Basic Concepts
  2.1 Signal processing basics
    2.1.1 Representation of variables
    2.1.2 Multirate signal processing
    2.1.3 Some definitions related to matrix algebra
  2.2 Filter bank basics
    2.2.1 General subband scheme
    2.2.2 Modulated filter banks
    2.2.3 Polyphase implementation
    2.2.4 Perfect reconstruction
    2.2.5 Overview of filter bank design techniques
  2.3 Adaptive filtering techniques for speech enhancement
    2.3.1 Standard adaptive filtering techniques
    2.3.2 Block-based techniques
  2.4 Computational cost

I DFT Modulated Filter Bank Design for Oversampled Subband Systems

3 Perfect Reconstruction Oversampled DFT Modulated Filter Bank Design
  3.1 Oversampled DFT modulated subband systems
    3.1.1 DFT modulated analysis filter bank
    3.1.2 DFT modulated synthesis filter bank
    3.1.3 Implementation issues
  3.2 Perfect reconstruction
    3.2.1 Smith-McMillan decomposition based perfect reconstruction filter bank design
    3.2.2 Para-unitary filter banks
  3.3 Para-unitary filter bank design
    3.3.1 Imposing para-unitarity
    3.3.2 Para-unitary lattices
    3.3.3 Optimization of the para-unitary lattices
    3.3.4 Adjusting the prototype filter length
    3.3.5 Design examples
  3.4 Conclusions

4 Nearly Perfect Reconstruction DFT Modulated Filter Bank Design
  4.1 Nearly perfect reconstruction DFT modulated filter banks
  4.2 Frequency-domain optimization
  4.3 Mixed time/frequency-domain optimization

II Subband and Frequency-Domain Adaptive Filtering

5 Subband Adaptive Filtering
  5.1 Subband adaptive systems
    5.1.1 General subband adaptive filtering setup
    5.1.2 Subband versus fullband adaptive filtering
    5.1.3 Filter bank selection
    5.1.4 Polyphase implementation
    5.1.5 DFT modulated subband adaptive filters
  5.2 Design criteria for subband adaptive systems
    5.2.1 Frequency selectivity
    5.2.2 Perfect reconstruction
    5.2.3 Perfect path modelling
  5.3 Downsampling and aliasing: two extreme cases
    5.3.1 Critically downsampled subband schemes
    5.3.2 Two-fold oversampled subband systems
  5.4 Subband adaptive filter length
    5.4.1 Infinite-length subband filters
    5.4.2 Introducing anti-causal filter taps
  5.5 Implementation cost and complexity gain with respect to LMS
    5.5.1 Rough cost estimate
    5.5.2 Detailed cost analysis
    5.5.3 Cost evaluation
  5.6 Conclusions

6 Analysis of the Partitioned Block Frequency-Domain Adaptive
    6.1.1 Derivation of the PBFDAF algorithm
    6.1.2 PBFDAF algorithm: equations and properties
    6.1.3 Normalization
    6.1.4 Constrained versus unconstrained updating
    6.1.5 Ambiguity compensation for $M > P + L - 1$
  6.2 The PBFDAF as a special case of subband adaptive filtering
  6.3 PBFDAF: design criteria
  6.4 Implementation cost
    6.4.1 Cost computation
    6.4.2 Cost evaluation and optimal parameter setting
  6.5 Conclusions

7 Fullband Error Adaptation Scheme
  7.1 Fullband error adaptation
  7.2 Computational complexity
  7.3 PBFDAF weight updating revisited
  7.4 Conclusions

III Iterated Partitioned Block Frequency-Domain Adaptive Filtering

8 Partitioned Block Frequency-Domain RAP
  8.1 Partitioned block frequency-domain RAP
    8.1.1 Definition
    8.1.2 Mechanism
  8.2 On iterating the PBFDRAP
    8.2.1 Computation of $\lim_{R \to \infty} w_p^{(n,R)}$
    8.2.2 Unconstrained PBFDRAP: $\lim_{R \to \infty} w_p^{(n,R)}$
    8.2.3 Constrained PBFDRAP: $\lim_{R \to \infty} w_p^{(n,R)}$
    8.2.4 Summary
  8.3 Simulation examples
  8.4 Conclusions

9 Fast Partitioned Block Frequency-Domain RAP
  9.1 Fast PBFDRAP
    9.1.1 Fast PBFDRAP, version 1
    9.1.2 Fast PBFDRAP, version 2
    9.1.3 Fast PBFDRAP, version 3
    9.1.4 Fast constrained PBFDRAP
    9.1.5 Summary
  9.2 Computational cost
    9.2.1 Unconstrained PBFDRAP
    9.2.2 Constrained PBFDRAP
    9.2.3 Unnormalized constrained PBFDRAP versus PRA
  9.3 Conclusions

IV Acoustic Echo Cancellation, Implementation and Experiments

10 Acoustic Echo Cancellation, Implementation & Experiments
  10.1 Robust operation and control
    10.1.1 Short-time energy
    10.1.2 Far-end activity detection
    10.1.3 Double-talk detection
  10.3 A real-time implementation of an acoustic echo canceller on DSP
    10.3.1 DSP equipment
    10.3.2 Software
    10.3.3 Experiments
  10.4 Conclusions

11 Conclusions and Further Research
  11.1 Conclusions
  11.2 Suggestions for further research

Bibliography

Appendices
  A Some definitions related to matrix algebra
  B Appendix to part I
    B.1 Proof of theorem 3.1
    B.2 Properties of B(z)
    B.3 Proof of theorem 3.2
    B.4 Proof of theorem 3.3
    B.5 Proof of theorem 3.4
    B.6 Inverse decomposition of para-unitary lattices
    B.7 Para-unitary parameterization for $M = 2N$
    B.8 Para-unitary DFT modulated filter banks revisited
  C Appendix to part II
    C.1 Proof of theorem 5.2
    C.2 Proof of theorem 5.3
    C.5 Proof of theorem 6.2
    C.6 "Time-reversed" PBFDAF
    C.7 Proof of theorem 6.3
    C.8 Proof of theorem 6.4
    C.9 Complexity analysis for the PBFDAF
    C.10 Proof of theorem 7.1
  D Appendix to part III
    D.1 Proof of theorem 8.1
    D.2 Proof of theorem 8.2
    D.3 Proof of theorem 8.3
    D.4 Proof of theorem 8.4
    D.5 Proof of theorem 8.5
    D.6 Constrained PBFDRAP: $L_{FB} < L$
    D.7 Proof of theorem 8.7

Introduction

In the first section of this introductory chapter a motivation is given for the techniques that will be developed in the forthcoming chapters of the thesis, and we will present some future perspectives on hands-free communication, which is the application we have in mind.

In section 1.2 a few examples of hands-free communication systems are given and the different types of signal degradation that occur are identified.

It appears that the characteristics of speech and the properties of the acoustic environment impose specific constraints on the type of signal enhancement algorithm that can be used and on the way the algorithms are applied. Hence, in section 1.3 some basics of speech and acoustics are discussed.

For each type of signal degradation that can be identified in the hands-free communication setup, a number of enhancement techniques are known from the literature. In section 1.4 several signal enhancement algorithms are briefly addressed.

An outline and an overview of the different chapters and parts of the thesis will be presented in section 1.5. The main contributions are summarized and references will be given to the publications that resulted from this work.

Some conclusions to this chapter are formulated in section 1.6.

1.1 Problem statement

The telecommunications market has rapidly expanded in recent years. This has

Figure 1.1: Number of worldwide cellular subscribers [39][179] (vertical axis: millions of worldwide cellular subscribers, 0 to 2000; horizontal axis: year, 1993 to 2005)

annual revenue of the global telecommunications market in 1996 was estimated at US$ 645 billion and is expected to surpass US$ 1 trillion in 2002 [85]. This growth is partly due to the expansion of the mobile phone industry. As indicated in figure 1.1, the estimated number of worldwide cellular subscribers now exceeds one billion and it is expected that this number will continue to increase substantially in the near future.

The telecommunications industry is characterized by an ongoing tendency towards innovation and optimization. This implies, among other things, a focus on user-friendliness and interactivity, and hence explains the increasing demand for hands-free communication systems today. As it is believed that more and more telecom applications will become hands-free in the near future, a large potential is expected for innovative and product-oriented research in the field of hands-free communication in the coming years. This is confirmed by the observation that the global hands-free market can grow from US$ 3 billion today to over US$ 9 billion in the next five years [151].

In present-day hands-free communication systems the signal quality is often unsatisfactory. Several types of signal deterioration can be distinguished, as will be

Figure 1.2: Hands-free communication setup

In this thesis subband and frequency-domain adaptive filtering techniques are studied. These signal processing algorithms can be used in a wide variety of applications where signal enhancement is required. In parts I, II and III of the thesis several signal processing algorithms will be considered. In part IV it will be shown that these signal processing techniques can be applied to enhance the signal quality in hands-free communication systems. We will concentrate on one form of degradation in particular, which is caused by so-called acoustic echoes, and illustrate how the algorithms discussed in parts I, II and III of the thesis can be employed.

1.2 Hands-free communication

1.2.1 Definition

Consider figure 1.2, which shows a typical hands-free communication setup. The conference room accommodates one or more correspondents, who interact with other people at a remote site via a wireless or wired communication channel. The room shown in figure 1.2 is called the near-end conference room as it accommodates the local or near-end speaker(s). At the remote site there is a similar room, called the far-end conference room, with the far-end speaker(s).

systems they are granted the freedom to walk around and to interact with each other in a natural way.

To establish hands-free communication, a number of microphones are installed in each conference room to record the local conversation. The recorded signals are then sent to the remote site, where they are fed into a set of loudspeakers.

1.2.2 Examples of hands-free communication systems

Hands-free telephony

Different sorts of applications fit in the hands-free communication framework. Most important from an economic point of view is certainly hands-free telephony. Recently, in many countries all over the world, mobile telephony while driving has been forbidden. Mobile phone calls in cars are allowed only if hands-free kits are used. This is motivated by the observation that hand-held mobile phone calls distract the driver and increase the number of accidents. During a mobile phone call the driver misses 4 out of 10 road signs and fails to give way to other vehicles in 25% of the cases. It appears that the accident risk increases by 75%, which reduces to 24% if a hands-free kit is used [171].

It was found that people in North America spend a combined 500 million passenger-hours in their vehicles each week. Although 65 percent of all cell-phone conversations take place in a car or other form of transport, less than 15 percent of the cell-phone users in the US have hands-free accessories [25]. So, a huge market for hands-free kits is expected in the near future.

A side remark, however, is that cell-phone usage is responsible for only 1.5 percent of all accidents in the US. Outside distraction, on the other hand, was responsible for almost 30 percent of all crashes. Adjusting the radio or changing a tape or CD was the second-biggest cause of accidents, amounting to 11 percent. Furthermore, it appears that the conversations themselves lead to dangerous driving behavior, not the type of phone that is used [25]. It should be added, however, that in contrast to the US, manual gear changes are still very popular in Europe. It is clear that it is almost impossible to change gear, use a mobile phone, and steer and drive safely at the same time.

The most common low-cost hands-free kits for mobile telephony in cars, such as the KX-TCA87 of Panasonic (US$ 25), are headsets with a (directional) microphone and headphone. The quality is satisfactory, but according to our definition of hands-free systems in section 1.2.1 these systems are not true hands-free solutions. A second class of products, such as the hands-free car adapter NTN1583 of Motorola (US$ 100), use a hands-free microphone and a built-in speaker, which are connected to the dashboard. These are hands-free systems, but the quality is

and guarantee a better sound quality. These systems, however, need to be built in and are integrated in the dashboard. The most advanced products rely on echo cancellation and noise suppression techniques. The Sonata III echo cancellation and voice enhancement system of NMS Communications was developed for service providers of E1 long distance and digital wireless technology. It is expected that in the near future smaller and more advanced solutions for hands-free telephony will be developed, which can be integrated in the hand-held mobile phones themselves and provide high quality wideband speech enhancement.

Teleconferencing

Apart from hands-free telephony, teleconferencing also fits in the hands-free communication framework presented in section 1.2.1. Teleconferencing systems are commonly used in business meetings today. Teleclassing, which enables students to attend classes and lectures from a remote classroom, is a special case of this. As the participants in a teleconferencing meeting can stay in their local office, unnecessary traveling is avoided. Hence, a large cost reduction is obtained and the loss of precious time is kept to a minimum. A market research report from Wainhouse Research states that the market for audio, video and web conferencing services will reach US$ 9.8 billion by 2006, up from US$ 2.8 billion in 2000 [135].

Powerful teleconferencing systems are already commercially available. Polycom, Inc., which acquired PictureTel Corporation in 2001, brings a range of full duplex audio conferencing equipment to the market. These solutions have a limited bandwidth and are suited for small business meetings. Larger systems are also available, such as the iPower(TM) 900 series of Polycom, Inc. They provide integrated audio and video conferencing and offer better audio quality. Future systems will have to cope with higher bandwidths and multi-channel signal enhancement, for which efficient signal processing algorithms are needed.

Domotic and voice-controlled systems

Nowadays there is an increasing interest in so-called domotic systems. More and more voice controlled systems are encountered in daily life at home and at work. These hands-free systems can be used for the automatic conditioning of a living room or the office at work (switching the light or the central heating on and off, opening the curtains, ...). Other examples are voice controlled electronic devices or HiFi systems, the on-board computer in your car, voice controlled PC software, ... . Telematics seems to be the next big challenge in the automotive industry, providing cellular voice and internet services in vehicles. In North America alone the market for telematics equipment is expected to grow to US$ 7 billion in 2007 [180].

Figure 1.3: Full-duplex hands-free communication setup (labels: near-end speaker, far-end speaker, acoustic far-end echo)

In 2001 Ford and Vodafone announced a strategic partnership to provide in-car telematic services. Within five years nearly all new Ford vehicles will be fitted with some telematics system. These systems will include voice recognition and text-to-speech technology to recognize spoken phone numbers as well as the names of previously entered contacts. Advanced signal processing techniques will be needed for adequate signal conditioning and preprocessing.

1.2.3 Signal deterioration

Consider again figure 1.2. Ideally, the desired near-end speech signal, which stems from a local correspondent, is sent to the remote site without any quality losses. It is clear that in a hands-free system the signal quality is degraded in many ways. Due to the large speaker-to-microphone distance undesired background signals are recorded and are transmitted to the correspondent as well.

A first type of disturbance is the so-called acoustic echo, which arises whenever a far-end loudspeaker signal is picked up by the near-end microphone(s) and is sent to the remote site. At the far-end site the same coupling might exist between loudspeaker and microphone and hence the signal can circle around in the system. The local speaker hears an echo or a delayed version of his/her own speech (figure 1.3). Such delayed signals hinder smooth conversation and lower the speech intelligibility. Delays could be quite long (several hundreds of milliseconds), especially when satellite links are involved. In the worst case the closed-loop gain might become too large and the loop becomes unstable, resulting in a harmful sinusoidal tone. A number

A second source of signal deterioration is "background noise". This type of disturbance can e.g. be generated by a ventilator or a computer fan. It can also come from people in the conference room who are not participating in the discussion but are having a discussion among themselves in the background (cf. cocktail party). In car applications noise is generated by the engine or by the car radio. It may also come from the wind passing around the car cabin or from the contact between road and tires [94] [160]. Signal processing techniques that are applied to reduce the background noise level are referred to as noise suppression or source separation algorithms. If a reference of the disturbing signal can be obtained, e.g. in the case of radio or engine noise, more specific enhancement techniques can be used. This is called interference cancellation and is very similar to acoustic echo cancellation.

Finally, note that all signals propagate through the recording room. As a consequence reverberation is added to the signals, which leads to another type of signal distortion. Although signals (especially music) may sound more pleasant when reverberation is added, in general the intelligibility is lowered. In order to cope with this kind of deformation, dereverberation or deconvolution techniques are called for.

1.3 Characteristics of speech and the acoustic environment

The characteristics of speech and the properties of the acoustic environment have an influence on the type of algorithm that is used and on the way the algorithms are applied. In this section some characteristics and peculiarities of speech and acoustics are discussed. Only those properties are mentioned that are important for the algorithms and techniques considered in this thesis. More detailed information on speech and signal processing for speech signals can be found in [29] [124]. A good reference on acoustics is [93].

1.3.1 Speech signals

Very often in hands-free applications the signal to be enhanced is speech. Speech is a signal with highly time-varying characteristics. Sometimes speech is quasi-periodic (e.g. vowels), at other instances it acts like colored noise (fricatives) or it is impulse-like (plosives). For example, in the word "peace" there is a clear difference between the plosive /p/, the vowel /i:/ and the fricative /s/.

Speech is a wideband signal with frequency components between 100 and 8000 Hz, hence covering more than 6 octaves. For speech understanding frequencies between 300 and 3400 Hz, i.e. 3.5 octaves, are of most interest. Hence, a sampling rate of

so-called wideband speech systems for which higher sampling rates, e.g. 16 kHz, are used.
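The octave counts quoted above for the speech band can be checked with a one-line calculation: the number of octaves between two frequencies is the base-2 logarithm of their ratio. The worked values below use only the band edges given in the text.

```latex
\[
\log_2\frac{8000\ \text{Hz}}{100\ \text{Hz}} = \log_2 80 \approx 6.3\ \text{octaves},
\qquad
\log_2\frac{3400\ \text{Hz}}{300\ \text{Hz}} \approx \log_2 11.3 \approx 3.5\ \text{octaves}.
\]
```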

It is observed that both the time envelope and the spectral content of speech are continuously changing: the energy of the speech signal is both time- and frequency-dependent. The mean frequency envelope of voiced speech is about -6 dB/octave. Signal enhancement algorithms have to cope with the changing frequency dependence and hence often rely on frequency-domain and subband techniques.

The time-domain evolution of the speech signal is characterized by its high dynamic range: speech pauses alternate with high energetic vowels or plosives, which significantly increase the short-time energy. This can e.g. be verified in figure 10.12 (chapter 10), where a speech signal is shown at the top. It is found that the amplitude of speech varies between 30 and 90 dB SPL [124]. In order to cope with these amplitude variations 12 to 16 bits linear quantization is commonly used for speech. Furthermore, due to the high dynamic range of the speech signal, signal enhancement algorithms have to be normalized by the actual signal energy. In this way the algorithm can be prevented from diverging and at the same time slow convergence can be avoided.
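One common way to realize the energy normalization described above is the NLMS update, in which the step is divided by the short-time energy of the input buffer. The following is a minimal sketch in plain Python and is not taken from the thesis: the function name `nlms_identify`, the toy system `w_true`, the step size 0.5 and the regularization constant `eps` are all illustrative assumptions. The input level jumps by a factor of 100 halfway through, loosely mimicking the transition from a speech pause to an energetic vowel.

```python
import random

def nlms_identify(x, d, num_taps, step=0.5, eps=1e-8):
    """Identify an FIR system from input x and desired signal d using NLMS."""
    w_hat = [0.0] * num_taps
    buf = [0.0] * num_taps              # most recent input samples, newest first
    errors = []
    for xk, dk in zip(x, d):
        buf = [xk] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w_hat, buf))   # filter output
        e = dk - y                                       # a priori error
        energy = sum(b * b for b in buf) + eps           # short-time input energy
        gain = step * e / energy                         # energy-normalized step
        w_hat = [wi + gain * bi for wi, bi in zip(w_hat, buf)]
        errors.append(e)
    return w_hat, errors

random.seed(0)
w_true = [0.0, 0.5, -0.3, 0.1]          # hypothetical 4-tap "acoustic path"
x = ([random.gauss(0.0, 0.01) for _ in range(2000)]     # quiet segment
     + [random.gauss(0.0, 1.0) for _ in range(2000)])   # loud segment

# Noise-free microphone signal d = w_true * x (convolution).
d, state = [], [0.0] * len(w_true)
for xk in x:
    state = [xk] + state[:-1]
    d.append(sum(wi * si for wi, si in zip(w_true, state)))

w_hat, errors = nlms_identify(x, d, num_taps=len(w_true))
print([round(w, 4) for w in w_hat])
```

Because the update is scaled by the instantaneous buffer energy, the same step size can be used for the quiet and the loud segment; an unnormalized LMS update with a fixed step would converge very slowly on the first and risk divergence on the second.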

1.3.2 The acoustic environment

It is observed from figure 1.2 that acoustic waves travel from source to listener and thereby propagate through the recording room. This propagation results in a signal attenuation and spectral distortion. It appears that the attenuation and the distortion can be modelled quite well by a linear filter. Nonlinear effects are typically of second order and mainly stem from the nonlinear characteristics of the loudspeakers. The linear filter that characterizes the acoustics and relates the emitted signal to the received signal is called the acoustic impulse response and plays an important role in many signal enhancement techniques.
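The linear-filter model above amounts to a convolution of the emitted signal with the acoustic impulse response. The sketch below illustrates this with a toy response; the values in `w` (three samples of delay, a direct-path impulse and two weaker reflections) are invented for illustration and are not measured data.

```python
def convolve(w, x):
    """Return the linear convolution of impulse response w with signal x."""
    y = [0.0] * (len(x) + len(w) - 1)
    for k, xk in enumerate(x):
        for i, wi in enumerate(w):
            y[k + i] += wi * xk
    return y

# Hypothetical toy impulse response: delay, direct path, two reflections.
w = [0.0, 0.0, 0.0, 1.0, 0.0, 0.4, -0.2]
x = [1.0, 0.5]                      # emitted (loudspeaker) signal
y = convolve(w, x)                  # received (microphone) signal
print(y)
```

The received signal starts three samples late and is smeared out over the length of the response, which is the attenuation and spectral distortion the text refers to.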

Acoustic impulse responses can be measured quite easily; an example is given in figure 1.4. Observe that the acoustic impulse response is characterized by a dead time. The dead time is the time needed for the acoustic wave to propagate from source to listener via the shortest, direct acoustic path. After the direct path impulse a set of early reflections is encountered, whose amplitude and delay are strongly determined by the shape of the recording room and the position of source and listener. Next comes a set of late reflections, also called reverberation, which decay exponentially in time. These impulses stem from multi-path propagation as acoustic waves reflect on walls and objects in the recording room. Acoustic impulse responses are typically highly time-varying, as shown by the following experiment.

Experiment 1.1 Consider the acoustic impulse response $w_1$ shown in figure 1.4.

Figure 1.4: Acoustic impulse response of the ESAT speech laboratory (amplitude versus time in seconds)

loudspeaker. The response $y = w_1 \star x$ was recorded with a microphone. The distance between loudspeaker and microphone was approximately 180 cm. Based on the loudspeaker and microphone signal, $w_1$ could be determined. Then the experiment was repeated. The configuration was slightly changed, moving the microphone 1 cm to the left and leaving the position of the loudspeaker and the rest of the environment unchanged. Again the acoustic impulse response was computed, resulting in $w_2$. Despite the small change in microphone position the impulse response changed substantially: it was found that

$$\frac{\|w_1 - w_2\|_2}{\|w_1\|_2} = 72\%.$$

To simulate the effect of moving correspondents in the recording room a dummy was placed between loudspeaker and microphone and the impulse response ($w_3$) was computed. Then the dummy was moved approximately 1 cm. All other objects were left unchanged. Again the acoustic impulse response $w_4$ was determined. In this case

$$\frac{\|w_3 - w_4\|_2}{\|w_3\|_2} = 34\%.$$
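The figure of merit used in Experiment 1.1 is the 2-norm of the difference between two impulse responses, relative to the 2-norm of the first. A minimal sketch of that computation follows; the short vectors `w_a` and `w_b` are made-up stand-ins and do not reproduce the measured responses or the 72% and 34% results.

```python
import math

def relative_change(w_ref, w_new):
    """Return ||w_ref - w_new||_2 / ||w_ref||_2 for equally long responses."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_ref, w_new)))
    ref = math.sqrt(sum(a * a for a in w_ref))
    return diff / ref

# Hypothetical 4-tap responses before and after a small change in the room.
w_a = [0.0, 1.0, 0.5, -0.25]
w_b = [0.0, 0.9, 0.6, -0.05]
change = relative_change(w_a, w_b)
print(f"relative change: {100 * change:.1f}%")
```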

are called for. Thanks to the continuous updating these algorithms are more or less robust against possible system variations.

To characterize the amount of reverberation in a recording room, the reverberation time ($RT_{60}$) is defined as the time that the sound pressure level or the intensity needs to decay to e.g. -60 dB of its original value. It is therefore a measure of the decay and of the duration of the acoustic impulse response. It appears to be independent of the actual position of source and listener. The reverberation time was computed for the impulse response shown in figure 1.4 following the method described in [60]. It appeared that $RT_{60} \approx 240$ ms.
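To make the definition of the reverberation time concrete, the sketch below estimates $RT_{60}$ from an impulse response by Schroeder backward integration followed by a straight-line fit to the energy decay curve. This is a commonly used technique chosen here as an assumption; the method of [60] is not reproduced in this text and may differ. The impulse response is synthetic (an exponentially decaying sequence designed to lose 60 dB in 0.24 s at an assumed 8 kHz sampling rate), not the measured response of figure 1.4.

```python
import math

def rt60_from_impulse_response(h, fs):
    """Estimate RT60 (seconds) via Schroeder integration and a line fit."""
    # Energy decay curve: energy remaining from each sample onwards.
    total, tail = 0.0, []
    for sample in reversed(h):
        total += sample * sample
        tail.append(total)
    tail.reverse()
    edc_db = [10.0 * math.log10(t / tail[0]) for t in tail if t > 0.0]
    # Least-squares line through the -5 dB ... -25 dB part of the curve.
    pts = [(k / fs, v) for k, v in enumerate(edc_db) if -25.0 <= v <= -5.0]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_v = sum(v for _, v in pts) / len(pts)
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))    # dB per second
    return -60.0 / slope

fs = 8000                                   # assumed sampling rate in Hz
alpha = 3.0 * math.log(10.0) / 0.24         # amplitude decays 60 dB in 0.24 s
h = [(-1) ** k * math.exp(-alpha * k / fs) for k in range(4000)]
rt60 = rt60_from_impulse_response(h, fs)
print(f"estimated RT60: {1000 * rt60:.0f} ms")
```

Fitting only the upper part of the decay curve and extrapolating to -60 dB is a practical choice, since the tail of a measured response is usually buried in noise long before it has decayed by 60 dB.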

Typical reverberation times are in the order of hundreds or even thousands of milliseconds. For a typical office room $RT_{60}$ is between 100 and 400 ms, for a church $RT_{60}$ can be several seconds long. Therefore, if the acoustic impulse responses in a digital signal enhancement application are characterized by FIR filters, many hundreds or several thousands of filter taps are needed, depending on the sampling rate. Hence, computationally efficient algorithms are required.
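As a rough illustration of the tap counts mentioned above, assume the FIR model has to span the full reverberation time, so that its length is about $RT_{60} \cdot f_s$. Taking the 240 ms found for figure 1.4, an assumed narrowband rate of 8 kHz, and the 16 kHz wideband rate mentioned earlier:

```latex
\[
L_{FB} \approx RT_{60} \cdot f_s:
\qquad
0.24\ \text{s} \times 8000\ \text{Hz} = 1920\ \text{taps},
\qquad
0.24\ \text{s} \times 16000\ \text{Hz} = 3840\ \text{taps}.
\]
```

Spanning the full $RT_{60}$ is a simplifying assumption; in practice the required length depends on how much residual echo is acceptable.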

In order to reduce the filter order, i.e. the number of delay elements, IIR models could be called for. It appears that although the order can be reduced in this way it still remains quite large, i.e. in the order of several hundreds [75] [108]. IIR-based enhancement techniques have to be relied on in that case, typically leading to either an increased computational load, or stability problems and convergence to local minima [108] [141].

In order to optimally control the experiments carried out in the frame of this thesis simulated room impulse responses were often used. These simulated acoustic impulse responses were designed following the method described in [4] [129] [154].

1.4 Enhancement techniques

Each of the three forms of signal degradation that arise in hands-free communication is now discussed in more detail, emphasizing existing algorithmic solutions that are known from the literature.

1.4.1 Acoustic echo cancellation

Experiments have shown that suppressing the acoustic echoes by 45 dB leads to satisfactory perceptual results, as long as the overall delay introduced by the echo canceller does not exceed a certain upper bound. The input-output delay is

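Figure 1.5: Adaptive acoustic echo cancellation. [Block diagram: the far-end signal $x$ drives the local acoustic path $w$ and the adaptive filter $\hat{w}$; the far-end echo is added to the near-end speaker signal $s$ to give $d$; the adaptive filter output $y$ is subtracted from $d$ to give the output $e$.]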

with respect to echo cancellation are contained in the ITU-T recommendations (G.167) on acoustic echo controllers [86]. For instance, the end-to-end delay is recommended not to exceed 16 ms for wideband teleconferencing. The far-end signal suppression (when no near-end signal is present) should reach 40 dB for teleconferencing systems and 45 dB in hands-free telephony. In presence of near-end signals (double talk) the suppression should be at least 25 dB. Convergence to a 3 dB attenuation level should last less than 20 ms in the case of single talk.

To suppress the echo several conventional acoustic echo cancellation techniques can be applied [77]. For instance, highly directional loudspeakers and microphones and sound absorbing materials can be used to avoid reflections. Another popular technique is voice controlled switching or loss control, which mutes channels in which no or very low-energetic activity is measured. It is clear that these techniques rely on accurate voice activity detection and hence quickly degrade. Further, the stability margin of the closed-loop system can be improved using so-called howling control. Thereto almost inaudible nonlinear operations are inserted in the signal path to avoid instability of the closed-loop system, as this would result in a harmful sinusoidal tone circling around in the network. Frequency-shifting, comb filters and resonant peak removal are often used. Finally, nonlinear post-processing devices can be added to remove residual error signals and to make the signal more pleasant to listen to.

In practice nowadays acoustic echo cancellers are based on adaptive filtering techniques [76] [77] [106] [176]. Adaptive filters will be discussed in section 2.3. A general adaptive acoustic echo cancellation setup is shown in figure 1.5. If the adaptive filter $\hat{w}$ is a good estimate of the acoustic impulse response $w$ it is observed that

$$\begin{aligned}
e[k] &= d[k] - y[k] && (1.1)\\
     &= (s[k] + w \star x) - \hat{w} \star x && (1.2)\\
     &\approx s[k], && (1.3)
\end{aligned}$$

Figure 1.6: Stereo acoustic echo cancellation setup

hence the echo can be removed. The adaptive filter $\hat{w}$ is a self-designing system that uses a gradient algorithm that minimizes the error signal energy. In this way a good replica of the unknown system $w$ can be obtained. Apart from the ability to obtain a good echo path replica, time variations of the acoustic impulse response can be tracked as well, thanks to the adaptivity. However, accurate tracking of the acoustic impulse response $w$ is still a challenge even if fast and hence expensive adaptive filtering structures are applied [62] [162], as acoustic impulse responses are known to be highly time-varying (cf. experiment 1.1).
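To make the idea of a gradient-driven echo canceller concrete, the sketch below uses the normalised LMS (NLMS) update, one standard gradient algorithm; the text does not say which adaptive algorithm is meant at this point, so NLMS is an assumption. The function name, step size `mu`, regularisation `eps` and the short echo path are illustrative choices, and the example covers the single-talk case only (no near-end signal $s$).

```python
import numpy as np

def nlms_echo_canceller(x, d, num_taps, mu=0.5, eps=1e-8):
    """Return (e, w_hat): the error signal and the final echo path estimate."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    w_hat = np.zeros(num_taps)
    x_buf = np.zeros(num_taps)           # holds x[k], x[k-1], ..., x[k-num_taps+1]
    e = np.zeros(len(x))
    for k in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[k]
        y = w_hat @ x_buf                # echo estimate y[k]
        e[k] = d[k] - y                  # error signal, as in Eq. 1.1
        # Normalised gradient step that reduces the squared error.
        w_hat += mu * e[k] * x_buf / (x_buf @ x_buf + eps)
    return e, w_hat

# Toy example: white far-end signal and a short, made-up echo path.
rng = np.random.default_rng(1)
w = np.array([0.0, 0.8, -0.4, 0.2, 0.1])
x = rng.standard_normal(4000)
d = np.convolve(x, w)[: len(x)]          # microphone signal: echo only
e, w_hat = nlms_echo_canceller(x, d, num_taps=len(w))
print("largest coefficient error:", np.max(np.abs(w_hat - w)))
```

With a realistic room response of hundreds or thousands of taps and a speech input, convergence is much slower than in this toy case, which is one reason for the more efficient structures studied in the thesis.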

In more advanced systems two or more loudspeaker channels have to be cancelled as shown in figure 1.6. It can be proven that stereo or, in general, multi-channel acoustic echo cancellation inherently suffers from a non-uniqueness problem [113]. In practice however, a unique solution to the stereo echo cancellation problem does exist, but the underlying optimization that drives the adaptive filters appears to be severely ill-conditioned. Several techniques were developed that cope with this issue. They try to decorrelate the stereo channels by insertion of nonlinearities in the signal paths or by applying psycho-acoustic noise masking techniques [58] [68] [87] [121].

Although commercial adaptive echo controllers are available on the market nowadays, providing a merely satisfactory solution to the single-channel acoustic echo cancellation problem, further improvement and research will be necessary in the coming years. It is for instance clear that in the near future there will be a need for $N$-channel acoustic echo controllers (e.g. for stereo, surround systems, Dolby Digital 5.1). Remark that the number of adaptive filters in an $N$-channel echo cancellation system equals $N^2$


mostly operate at rather low sampling rates (8 kHz) higher quality will be required in the near future (16 kHz, or even higher). As the complexity of an echo cancellation system using a linear adaptive filtering algorithm changes quadratically with the sampling rate, again efficient adaptive structures will be needed. Finally, there will be a request for a better overall performance and more robustness in highly non-stationary and complex acoustic environments. This requires reliable control software, which is added to the adaptive filtering scheme.
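The quadratic dependence on the sampling rate mentioned above can be seen with a simple cost model: the number of taps needed to cover a fixed response duration grows linearly with the sampling rate, and so does the number of filter updates per second. The sketch below assumes a time-domain LMS-type filter with roughly two multiplications per tap and per sample; this constant and the function name are assumptions for illustration only.

```python
def multiplications_per_second(response_duration_s, fs_hz, ops_per_tap=2):
    """Rough cost model: ops_per_tap multiplications per tap and per input sample."""
    num_taps = round(response_duration_s * fs_hz)   # filter length grows with fs
    return ops_per_tap * num_taps * fs_hz           # and so does the update rate

cost_8k = multiplications_per_second(0.25, 8000)
cost_16k = multiplications_per_second(0.25, 16000)
print("cost ratio when doubling the sampling rate:", cost_16k / cost_8k)
```

Under this model, doubling the sampling rate multiplies the cost by four, whatever the value of the per-tap constant.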

1.4.2 Noise suppression and interference cancellation

Single-channel noise reduction methods have been known for a long time now. They exploit the characteristics of speech and the noise and enhance the SNR by appropriate (matched or Wiener) filtering operations [149]. More advanced techniques, commonly used today, rely on spectral subtraction [11] [182].

Noise suppression is a difficult problem. It is observed that the signal of interest and the background noise typically overlap both in the time and in the frequency domain. This is certainly true when both signals are speech. The signal of interest and the "noise" are therefore difficult to separate if classical spectro-temporal enhancement techniques are employed.

It is observed however that the correspondent and the background noise source are typically at a different position in the conference room. Hence, multi-microphone techniques can be called for, which exploit the spatial information present in the different microphone signals. This in general leads to spatio-temporal filtering operations and increases the performance.

A first class of enhancement techniques that rely on this spatial diversity is beamforming. The beamforming idea comes from telecommunications where it was introduced to design antenna arrays. Later it was successfully applied to acoustic applications as well. As the acoustic environment is inherently time-varying, adaptive beamforming techniques are often called for. Broadband beamforming for speech enhancement is still a topic of ongoing research [17] [18] [34] [66] [74] [91] [92] [97] [122] [123] [125] [136] [150] [158] [161] [164] [165] [172] [174] [175].

More recently optimal filtering techniques have been proposed for the suppression of additive broadband noise. These techniques rely on powerful matrix decompositions such as the SVD and the Quotient SVD [33] [148]. They show a superior performance compared to classical beamforming approaches but are computationally more demanding.

If a reference of the noise signal can be obtained more specific signal enhancement techniques can be applied. For instance, in the case of engine noise in a car the spark signal can be measured and used to suppress the noise in the car cabin. Adaptive


Echo cancellation and noise suppression have been addressed independently for many years now. Recently, it has been recognized that both problems are better tackled in a combined approach, especially if multi-microphone settings are being used. Initial results indicate that the combined approach yields a better performance at a lower computational cost [1] [31] [63] [102] [103] [104] [105].

Multi-microphone noise reduction schemes are being commercialized nowadays. The systems that are available on the market however are typically rather basic solutions with a limited number of microphones and often relying on simple, not fully adaptive signal processing tools. There is certainly a need for more powerful and robust systems with a higher performance at an acceptable cost in the forthcoming years.

1.4.3 Dereverberation

Of the three types of signal deterioration that occur in hands-free communication reverberation is least prominent. However, in rooms with a high reflectivity reverberation effects have a clearly negative impact on the intelligibility. Dereverberation techniques have been developed over the last years but the solutions available today are not yet satisfactory.

Single-channel dereverberation techniques were reported first. Inverse filtering can be called for, by trying to invert the acoustic impulse response. However, as the impulse responses are known to be non-minimum phase systems they have an unstable inverse [112] [120]. Cepstrum-based techniques are more promising [6] [126] [131] and rely on the separability of speech and the acoustics in the cepstral domain.

Through multi-channel processing the spatial diversity of the hands-free setup can be exploited, in general leading to a better performance. Acoustic beamforming techniques are being used, as apart from noise suppression they are known to partially dereverberate the signals as well. A second class of multi-channel dereverberation techniques is based on cepstral processing. It was shown that the single-channel cepstral based dereverberation algorithms can be extended to the multi-channel case [96].

Matched filtering algorithms were reported in [2] [167]. They rely on subspace tracking techniques. These algorithms show an improved dereverberation capability with respect to classical approaches, but as some environmental parameters are assumed to be known in advance these approaches may be less suitable in practical applications.

During the last years MIMO blind system identification techniques have been developed for equalization in digital communications [80] [118] [163] [166]. These


1.5 Outline of the thesis and contributions

In this section an outline and an overview of the thesis can be found. The main contributions are summarized and references will be given to the publications that were brought about in the frame of this work.

1.5.1 Motivation

In this thesis subband and frequency-domain adaptive filtering techniques are studied, putting forward acoustic echo cancellation as a possible and straightforward application.

Acoustic echo cancellation, as well as other signal enhancement problems in hands-free communication, deals with the retrieval of degraded speech embedded in "noise". To enhance the speech signal the acoustics of the recording room need to be estimated. In section 1.3 we discussed some properties of speech and the acoustic environment that impose specific constraints on the signal enhancement algorithm that is used. It was for instance observed that acoustic impulse responses are time-varying high-order systems. It was further indicated in section 1.4.1 that there will be a need for (multi-channel) acoustic echo controllers in the near future that offer a high performance at increasing sampling rates. Hence, computationally efficient and adaptive algorithmic solutions should be called for. Finally, as the time envelope and the spectral content of speech are continuously changing, time- and frequency-dependent signal processing is required.

It will appear in the forthcoming chapters of the thesis that subband and frequency-domain adaptive filtering techniques meet all the requirements specified above, combining adaptivity and frequency-dependent processing, and offering a high performance at a low cost. Hence, subband and frequency-domain adaptive filters will be put forward as being appropriate approaches to solve the acoustic echo cancellation problem.

It is not only our objective to present existing and novel subband and frequency-domain adaptive solutions for acoustic echo cancellation, we will also dwell on the structures and principles that lie behind these techniques, in an attempt to get more insight in the underlying fundamentals. Whereas acoustic echo cancellation was presented as the starting point and a motive for this research, the main part of the text deals with signal processing as such. The presented techniques can be employed in many applications, going far beyond acoustic echo cancellation alone.

1.5.2 Chapter by chapter overview and contributions

The introductory and concluding chapter are omitted however in the figure.

In chapter 2 some basic concepts are discussed and the signal processing tools needed to understand the main part of the text are presented.

Part I: DFT modulated filter bank design for oversampled subband systems

It was motivated in this introductory chapter that frequency-dependent adaptive signal processing is required for adequate acoustic echo suppression. Frequency dependency can be achieved through the use of digital filter banks and the integration of these structures in existing adaptive filtering schemes, leading to so-called subband adaptive filters. In general however, digital filter banks introduce considerable signal and aliasing distortion. In part I of the thesis design methods for perfect and nearly perfect reconstruction DFT modulated filter banks are discussed. These filter banks introduce no or almost no signal distortion and are easily integrated in subband adaptive filtering structures.

In chapter 3 design methods for perfect reconstruction oversampled DFT modulated filter banks are presented. A para-unitary filter bank design method is discussed, which was presented in [22]. With this method however the order of the filter banks cannot be adjusted accurately. We present an extension to this method, which basically allows any desired filter length to be chosen. Further, we show that based on the inverse parametrization of the filter bank parameters appropriate starting values can be obtained, which reduces the optimization time.

The stopband attenuation of perfect reconstruction filter banks is typically unsatisfactory if intermediate operations, such as adaptive filtering, are performed on the subband signals. In chapter 4, the perfect reconstruction condition is relaxed to nearly perfect reconstruction. Both a frequency-domain and a mixed time/frequency-domain based design method are presented for nearly perfect reconstruction DFT modulated filter banks. Subband adaptive filtering is taken as an example to illustrate that thanks to their lower stopband level nearly perfect reconstruction filter banks outperform perfect reconstruction systems.

Publications related to the first part of the thesis are [43] [45] [52].

Part II: Subband and frequency-domain adaptive filtering

In section 1.5.1 subband and frequency-domain adaptive filters were put forward

[Figure: outline of the thesis. Chapter 2: Basic Concepts. Part I: Chapter 3, Filter Bank Design (Perfect Reconstruction); Chapter 4, Filter Bank Design (Nearly Perfect Reconstruction). Part II: Chapter 5, Subband Adaptive Filtering; Chapter 6, Partitioned Block Frequency-Domain Adaptive Filtering; Chapter 7, Fullband Error Adaptation. Part III: Chapter 8, Partitioned Block Frequency-Domain RAP; Chapter 9, Fast Partitioned Block Frequency-Domain RAP. Part IV: Chapter 10, Acoustic Echo Cancellation Experiments.]

in more detail and discuss some of their properties. Although both approaches were developed independently in the literature they are strongly connected to each other. We will focus on the interrelation between both techniques and combine their mechanisms to obtain improved algorithmic structures.

The subband adaptive filter is discussed in chapter 5. A comparison is made between the subband approach and standard fullband adaptive filters in terms of complexity and performance. It will be shown that subband adaptive filtering structures suffer from a considerable residual undermodelling error unless extra (anti-)causal subband filter taps are inserted. Although the complexity gain w.r.t. the fullband approach is less than expected, still a considerable cost reduction can be obtained. Next, we formulate three design criteria for subband adaptive systems, which deal with frequency selectivity, perfect reconstruction and perfect path modelling. These conditions are necessary requirements to ensure satisfactory performance of the subband adaptive filter.

In chapter 6 the partitioned block frequency-domain adaptive filter (PBFDAF) is studied. It appears that this algorithm, which is known from the literature for some years now, outperforms standard subband systems in terms of convergence behavior and modelling capabilities. It will be proven that the PBFDAF can be considered as a special subband adaptive filtering structure, which fulfills two out of the three design criteria for subband adaptive systems that are specified in chapter 5. It is further shown that the frequency-domain adaptive filter relies on a special error correction mechanism. Thanks to the error correction the filter coefficients can be updated with aliasing-free error signals, which leads to improved performance.

In an attempt to generalize the error correction mechanism of the frequency-domain approach to subband adaptive systems we propose a novel fullband error adaptation scheme for subband adaptive filters in chapter 7. The alternative adaptation scheme adjusts the subband filters based on the fullband error instead of using the subband errors, as is done in a classical subband adaptive system. In this way improved performance is obtained. It is shown that for some common parameter settings the weight update mechanism of the so-called unconstrained PBFDAF corresponds to that of the fullband error adaptation algorithm presented in this chapter. This proves that the fullband error adaptation algorithm can be considered as an extension of the frequency-domain error correction mechanism to a more general class of subband adaptive filters.

Part III: Iterated partitioned block frequency-domain adaptive filtering

In part III an extension to the PBFDAF is proposed, called the PBFDRAP, which is an adaptive filtering algorithm combining partitioned block frequency-domain adaptive filtering with so-called row action projection. The algorithm is presented and analyzed and fast implementation schemes are derived.

In chapter 8 the PBFDRAP is defined and it is explained how extra error suppression can be achieved w.r.t. the PBFDAF. Further, the asymptotic properties of the algorithm are analyzed: for some parameter settings the PBFDRAP algorithm approaches well-known adaptive filtering algorithms. Finally, it is shown that the PBFDRAP outperforms the PBFDAF in a realistic echo cancellation setup.

Fast implementations are derived for the PBFDRAP algorithm in chapter 9. The different fast implementation schemes are compared with the standard implementation of the PBFDRAP for different parameter settings. It appears that a significant complexity reduction can be obtained. The PBFDRAP adaptive filter is then compared with the PRA algorithm from a computational complexity point of view. It is seen that for large block lengths the PBFDRAP is a cheaper alternative to the PRA.

Publications related to this part are [50] [54] [55].

Part IV: Acoustic echo cancellation, implementation and experiments

In the final part of the thesis the acoustic echo cancellation problem is revisited. It was pointed out in section 1.4.1 that in the near future there will be a request for more robust acoustic echo cancellation schemes offering a better overall performance in highly non-stationary and complex acoustic environments. This requires reliable control software, which is added to the adaptive filtering scheme to monitor the adaptation speed.

Chapter 10 illustrates how the different adaptive filters developed in the preceding chapters can be applied to an acoustic echo cancellation setup, providing them with control and so-called double-talk detection techniques, known from the literature. Several experiments are discussed, different adaptive filtering solutions are compared and some observations concerning a real-time implementation of an acoustic echo canceller on DSP are presented.

Publications related to this part are [44] [46].

1.6 Conclusions

In the first section of this chapter the economic impact of telecommunication technology and hands-free communication in particular was highlighted and a motivation was given for the work that was performed in the frame of this thesis.

In section 1.2 hands-free communication was defined, examples were given and it was pointed out that different sorts of signal degradation do occur.

In section 1.3 some basics of speech and acoustics were studied.

It was shown in section 1.4 that a large variety of signal enhancement techniques are known from the literature. They can be employed in present-day hands-free communication systems to obtain a better signal quality.

In section 1.5 an outline and an overview was given of the different chapters and

Basic Concepts

In this chapter some basic concepts are discussed and the signal processing tools needed to fully understand the forthcoming chapters of the thesis are presented.

Many of the algorithms described in this thesis are so-called block based adaptive filters. Often in this kind of algorithm signals with different sampling rates coexist, hence the name multirate systems. In section 2.1 some basics of signal processing and of multirate signal processing in particular are therefore discussed.

Part I of the thesis focuses on digital filter bank design. These filter banks can then be integrated in the subband adaptive filtering structures that are discussed in part II. Section 2.2 therefore discusses some filter bank fundamentals and presents the background information that is needed to fully understand parts I and II of this work.

The algorithms presented in parts II and III are adaptive filters. A brief overview of existing adaptive filtering techniques will be given in section 2.3.

For many of the algorithms that are discussed further on, a cost analysis will be performed. The assumptions we will make for these cost analyses are summarized in section 2.4.

2.1 Signal processing basics

2.1.1 Representation of variables

Most of the signals, filters and systems that are referred to in this thesis are discrete-time variables. They are represented in the time, the frequency or in the z-domain. The time-domain representation of a variable $h$

$$h[k] = \{\, \ldots \;\; h[-1] \;\; h[0] \;\; h[1] \;\; h[2] \;\; \ldots \,\} \qquad (2.1)$$

depends on the discrete time $k$, which relates to the actual time $t = k/f_s$ via the sampling frequency $f_s$.

$H(z)$ is the z-transform of $h[k]$ and is defined as

$$H(z) = \sum_{k=-\infty}^{\infty} h[k]\, z^{-k}. \qquad (2.2)$$

An in-depth discussion of the use and validity of the z-transform can be found in many books on signal processing [126] [134] or control theory [65] [110].

By evaluating $H(z)$ on the unit circle, i.e. replacing $z$ by $e^{j2\pi f}$ in Eq. 2.2, the frequency-domain representation of $h[k]$ is obtained:

$$H(f) = \sum_{k=-\infty}^{\infty} h[k]\, e^{-j2\pi k f}. \qquad (2.3)$$

$H(f)$ is periodic in the frequency $f \in \mathbb{R}$. For the evaluation of the frequency-domain characteristics the fundamental interval (1 period) is usually considered, i.e. $-\frac{1}{2} < f \le \frac{1}{2}$, in which $f = 0.5$ corresponds to the Nyquist frequency. The inverse frequency-domain transformation

$$h[k] = \int_{-1/2}^{1/2} H(f)\, e^{j2\pi k f}\, df \qquad (2.4)$$

computes $h[k]$ from $H(f)$.

2.1.2 Multirate signal processing

In many of the adaptive filtering algorithms discussed in this thesis signals with different sampling rates are encountered. As different sampling rates coexist within the same algorithm these adaptive filtering structures are called multirate systems.

To fully describe a multirate system in the time domain several discrete-time variables should be defined and used in parallel. It is however more convenient to represent these systems in the z-domain.

To describe the conversion from one sampling rate to another, two operations will be discussed here: the reduction of the sampling rate with an integer factor, called downsampling, and the increase of the sampling rate with an integer factor, which is referred to as upsampling.

Downsampling

$f[m]$ is an $N$-fold downsampled version of $h[k]$ if

$$f[m] = h[k]\big|_{N\downarrow} = h[mN], \qquad \forall m \in \mathbb{Z},\; N \in \mathbb{N}_0. \qquad (2.5)$$

In this way the sampling rate is reduced by a factor $N$. In signal flow graphs $N$-fold decimators or downsamplers are represented as $N\!\downarrow$. It can be shown [156] that

$$F(z) = \frac{1}{N} \sum_{n=0}^{N-1} H\!\left(z^{1/N} e^{-j\frac{2\pi n}{N}}\right) \qquad (2.6)$$

holds, in which $f[k] \leftrightarrow F(z)$ and $h[k] \leftrightarrow H(z)$ are z-transform pairs.

Upsampling

$f[m]$ is an $N$-fold upsampled version of $h[k]$ if

$$f[m] = h[k]\big|_{N\uparrow} = \begin{cases} h[m/N] & \text{if } m = pN,\; \forall p \in \mathbb{Z},\; N \in \mathbb{N}_0 \\ 0 & \text{otherwise,} \end{cases} \qquad (2.7)$$

which increases the sampling rate by a factor $N$. In signal flow graphs $N$-fold expanders or upsamplers are marked as $N\!\uparrow$. It can be shown [156] that

$$F(z) = H(z^N). \qquad (2.8)$$

Both operations introduce artifacts. In the case of downsampling aliasing is added to the signal. Upsampling invokes so-called mirror frequencies. More information about this and how to get rid of the artifacts can be found in any good book on (multirate) signal processing, e.g. [156].
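The two operations and their frequency-domain counterparts can be checked numerically. The sketch below is a minimal illustration for finite sequences that start at $k = 0$; the helper names `downsample`, `upsample` and `dtft` and the test sequence are assumptions made for the example. On the unit circle, Eq. 2.8 reads $F(f) = H(Nf)$ and Eq. 2.6 reads $F(f) = \frac{1}{N}\sum_{n} H\big((f-n)/N\big)$.

```python
import numpy as np

def downsample(h, N):
    """Eq. 2.5: keep every N-th sample, f[m] = h[mN]."""
    return np.asarray(h)[::N]

def upsample(h, N):
    """Eq. 2.7: insert N-1 zeros between consecutive samples."""
    h = np.asarray(h)
    f = np.zeros(len(h) * N, dtype=h.dtype)
    f[::N] = h
    return f

def dtft(h, f):
    """Eq. 2.3 for a finite sequence starting at k = 0, at one frequency f."""
    k = np.arange(len(h))
    return np.sum(h * np.exp(-2j * np.pi * k * f))

h = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
N = 2
f0 = 0.1  # arbitrary test frequency

# Upsampling compresses the spectrum: F(f) = H(N f), which creates mirror images.
up_ok = np.isclose(dtft(upsample(h, N), f0), dtft(h, N * f0))

# Downsampling averages N shifted and stretched copies of H: this is the aliasing.
aliased = sum(dtft(h, (f0 - n) / N) for n in range(N)) / N
down_ok = np.isclose(dtft(downsample(h, N), f0), aliased)
print(up_ok, down_ok)
```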

2.1.3 Some definitions related to matrix algebra

In appendix A a few matrix algebra definitions and properties are combined that will be used and referred to in the forthcoming chapters of the thesis. A good

Figure 2.1: A general subband scheme: all intermediate operations can be done at the downsampled rate, which typically leads to a reduced implementation cost and improved performance. [Block diagram: the input $x$ passes through the analysis filter bank $H_0, H_1, \ldots, H_{M-1}$ and $N$-fold downsamplers, then the intermediate processing, then $N$-fold upsamplers and the synthesis filter bank $G_0, G_1, \ldots, G_{M-1}$, whose outputs are summed to give $y$.]

2.2 Filter bank basics

Filter banks are widely used in digital signal processing [156]. Typical applications are subband coding [87] [169] and subband adaptive filtering [53] [142]. Subband techniques can improve the performance of standard fullband algorithms for speech, audio or image processing, as they allow an optimal tuning of the algorithm in each subband. In this way subband algorithms often outperform their globally tuned fullband counterparts. Furthermore, by using multirate techniques the implementation cost can typically be reduced.

2.2.1 General subband scheme

A general subband scheme is shown in figure 2.1. The so-called analysis filter bank splits the input $x$ in a set of small band signals: the input is filtered with each of the $M$ analysis filters $H_0, \ldots, H_{M-1}$ and each subband signal is $N$-fold downsampled. Hence, intermediate operations can be performed on the subband signals at the downsampled rate and this typically leads to a cheaper implementation. Finally, a recombination operation takes place in the synthesis filter bank $G_0, \ldots, G_{M-1}$, which operates on the $N$-fold upsampled subband channels and results in the output $y$.

A filter bank is a set of parallel filters, which each filter out a part of the frequency spectrum. If all filters have the same bandwidth the filter bank is said to be

perception and therefore the bandwidth of the different filters is changed in a logarithmic way. Non-uniformly spaced filter banks are often tree- or wavelet-based [156] [169] and might be more suitable for applications such as audio or video. In this thesis only uniformly spaced filter banks will be considered.

Uniformly spaced filter banks are typically obtained by modulating, i.e. frequency shifting, a well-designed lowpass prototype filter. Hence, they are called modulated filter banks. Each of the analysis filters $H_0, \ldots, H_{M-1}$ then filters out a part of the frequency spectrum and as there are $M$ subbands in total the bandwidth of each of the analysis filters is equal to (or larger than) $f_s/M$, in which $f_s$ represents the sampling frequency corresponding to the input signal $x$. Modulated filter banks can easily be implemented by decomposing the prototype filter in polyphase components and applying a DFT or DCT operation (see sections 2.2.3, 3.1 and [156]). The latter operations can be implemented efficiently using fast signal transforms.

Critically downsampled subband schemes

For a uniformly spaced modulated filter bank the bandwidth of each of the filters is larger than or equal to $f_s/M$. Hence, if the downsampling factor $N$ is larger than $M$ a considerable amount of aliasing will be inserted in the subbands by the downsampling operation. Aliasing is often detrimental for the performance of the intermediate subband operations (e.g. subband adaptive filtering, see also figure 5.10). Hence in practice, $N$ is restricted to be smaller than or equal to $M$, with $M = N$ being an upper bound for $N$. A subband system for which $M = N$ is called a critically downsampled subband scheme. In this case the implementation cost can be optimally reduced: the intermediate operations and in many cases also the filter bank operations (see section 2.2.3) can be done at the lowest sampling rate, as $N$ is as large as possible.

Oversampled subband schemes

In practice however, finite order filter banks have to be used in order to limit the processing delay and the computational complexity. Finite order filters have a non-negligible transition bandwidth and therefore aliasing will be inserted in the subbands even if the downsampling factor $N$ is smaller than $M$. Critically downsampled finite order subband schemes are strongly sensitive to aliasing and in many cases the loss in performance due to the critical downsampling is not acceptable. From this point of view oversampled subband schemes ($M > N$) are more attractive as they trade off between complexity reduction and aliasing distortion.

2.2.2 Modulated filter banks

Figure 2.2: 4-band DFT modulated filter bank: frequency amplitude response (horizontal axis: digital frequency, from -0.5 to 0.5; bands labelled $m = 0, \ldots, 3$)

The $M$ subband filters of a DFT modulated filter bank are derived by frequency shifting a well-designed lowpass prototype filter $h_0[k]$ of length $L_f$ in the following way:

$$\begin{aligned}
h_m[k] &= h_0[k]\, e^{-j\frac{2\pi k m}{M}}, \qquad m = 0 : M-1, \quad k = 0 : L_f - 1 && (2.9)\\
\Longleftrightarrow \quad H_m(z) &= H_0\!\left(e^{j\frac{2\pi m}{M}} z\right) && (2.10)\\
\Longleftrightarrow \quad H_m(f) &= H_0\!\left(f + \frac{m}{M}\right). && (2.11)
\end{aligned}$$

Equation 2.10 follows from Eqs. 2.2 and 2.9, and Eq. 2.11 can be obtained from Eq. 2.10 by replacing $z$ by $e^{j2\pi f}$. The filters are frequency shifted versions of each other and the complete set of $M$ filters covers the whole frequency spectrum. An example of a DFT modulated filter bank, with $M = 4$, is shown in figure 2.2.
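The modulation rule can be verified numerically. The sketch below assumes that the exponent in Eq. 2.9 is negative, which is what Eq. 2.11 implies under the transform convention of Eq. 2.3. The Hann window used as prototype is only a placeholder lowpass filter, not one of the designed prototypes of the thesis, and the function names are illustrative.

```python
import numpy as np

def dft_modulated_bank(h0, M):
    """Return an (M, L_f) array whose m-th row is h_m[k] = h0[k] * exp(-j 2 pi k m / M)."""
    h0 = np.asarray(h0, dtype=float)
    k = np.arange(len(h0))
    m = np.arange(M)[:, None]
    return h0 * np.exp(-2j * np.pi * k * m / M)

def freq_response(h, f):
    """Eq. 2.3 evaluated at a single digital frequency f."""
    k = np.arange(len(h))
    return np.sum(h * np.exp(-2j * np.pi * k * f))

M = 4
h0 = np.hanning(32)          # placeholder lowpass prototype (assumption)
h0 = h0 / h0.sum()           # unit gain at f = 0
bank = dft_modulated_bank(h0, M)

f0 = 0.07                    # arbitrary test frequency
shift_ok = all(
    np.isclose(freq_response(bank[m], f0), freq_response(h0, f0 + m / M))
    for m in range(M)
)
print("H_m(f) == H_0(f + m/M) for all m:", shift_ok)
```

With this sign convention the passband of $H_m$ is centred at $f = -m/M$ (modulo 1), since $H_0$ is centred at $f = 0$.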

A variant of this is the Inverse DFT modulated filter bank, which is defined as

$$h_m[k] = h_0[k]\, e^{j\frac{2\pi k m}{M}}, \qquad m = 0 : M-1, \quad k = 0 : L_f - 1. \qquad (2.12)$$

Figure 2.3: 4-band DCT modulated filter bank: frequency amplitude response (horizontal axis: digital frequency, from -0.5 to 0.5; bands labelled $m = 0, \ldots, 3$)

The z-transform and frequency-domain representation are given by

$$\begin{aligned}
H_m(z) &= H_0\!\left(e^{-j\frac{2\pi m}{M}} z\right) && (2.13)\\
\Longleftrightarrow \quad H_m(f) &= H_0\!\left(f - \frac{m}{M}\right). && (2.14)
\end{aligned}$$

IDFT modulation differs from DFT modulation only in the order in which the filters are frequency shifted.

Apart from DFT modulated filter banks cosine or DCT modulated filter banks are often used. Based on a well-designed FIR prototype filter $p[k]$ of length $L_f$ the different DCT analysis filters $h_m[k]$ can be derived as follows [156]:

$$h_m[k] = 2\,p[k] \cos\!\left(\frac{\pi}{M}\left(m + \frac{1}{2}\right)\left(k - \frac{L_f - 1}{2}\right) + (-1)^m \frac{\pi}{4}\right), \qquad m = 0 : M-1. \qquad (2.15)$$

An example of a DCT modulated filter bank, with $M = 4$, is shown in figure 2.3. Whereas DCT modulated filter banks are typically used in critically downsampled

modu-frequent subbands tend to overlap when the subband signals are not critically downsampled. In this way a large amount of aliasing is inserted in the subbands. As it is our goal to design oversampled filter banks ($M > N$) that introduce only a small quantity of subband aliasing, standard DCT modulated filter banks are not applicable. Some schemes have been proposed that combine real filter banks with unequal subsampling in different bands to overcome this problem [78].

In the case of DFT modulated filter banks on the other hand the subband filters filter out a single contiguous frequency region. If there are $M$ subbands and the lowpass prototype filter has a good stopband rejection, the bandwidth of the bandpassed signals that are filtered out by each of the subband filters is approximately $f_s/M$. Hence, the subband signals are correctly projected into the new fundamental interval $\left[-\frac{f_s}{2N}, \frac{f_s}{2N}\right]$ by the downsampling operation if $M > N$, avoiding severe aliasing distortion. As a consequence, oversampled subband schemes are often based on DFT modulated filter banks because of their aliasing robustness and ease of implementation.

2.2.3 Polyphase implementation

The analysis and the synthesis lter bank are immediately followed, respectively

precededbydownsamplingorupsamplingunits(see gure2.1). Hence,itischeaper

to do notonly theintermediate processing, but also the lterbank operations at

thedownsampledrate,whichcanbeachievedthroughpolyphasedecomposition.

Analysis bank

IfthesignalspassingthroughtheM{bandanalysisbankare subsequentlyN{fold

downsampled, each subband lter h

m

[k] can be decomposed in its N{th order

polyphasecomponents: H m (z)= N 1 X n=0 z n H m n:N (z N ); (2.16) inwhichH mn:N

(z)isthen{thoutofN polyphasecomponentsofthem{thsubband

lterh

m

[k],in otherwordsthez{transformofh

m

[n+Nk],k=0;1;::: :

Swapping the polyphase components and the downsamplers leads to a more efficient implementation. The filtering operations can now be done at the lower sampling rate, as shown in figure 2.4 for the $m$-th subband.
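The equivalence between filtering at the full rate followed by $N$-fold downsampling and filtering with the polyphase components at the low rate can be checked numerically. The following is a minimal sketch for a single subband filter, assuming finite-length sequences that are zero outside their support. The helper names `convolve`, `filter_then_downsample` and `polyphase_downsample` are hypothetical.

```python
def convolve(a, b):
    """Plain linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def filter_then_downsample(h, x, N):
    """Reference: filter at the input rate, then keep every N-th sample."""
    return convolve(h, x)[::N]

def polyphase_downsample(h, x, N):
    """Same output, computed at the low rate with the N polyphase components."""
    n_out = len(range(0, len(h) + len(x) - 1, N))
    y = [0.0] * n_out
    for n in range(N):
        e_n = h[n::N]                      # taps h[n + N*k], k = 0, 1, ...
        if not e_n:
            continue
        # Branch input: x delayed by n samples, then N-fold downsampled.
        x_n = [x[k * N - n] if 0 <= k * N - n < len(x) else 0.0
               for k in range(n_out)]
        branch = convolve(e_n, x_n)
        for k in range(n_out):
            y[k] += branch[k]
    return y

h = [1.0, 2.0, 3.0, 4.0, 5.0]              # toy subband filter
x = [1.0, -1.0, 2.0, 0.0, 3.0, 1.0, 4.0]   # toy input signal
y_ref = filter_then_downsample(h, x, 3)
y_poly = polyphase_downsample(h, x, 3)
```

Both routes give the same samples, but in the polyphase route every multiplication contributes to an output sample that is actually kept.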

The analysis bank can now schematically be represented as shown in figure 2.5. $\mathbf{H}(z)$ is called the analysis polyphase matrix [156]. Element $(m, n)$ of $\mathbf{H}(z)$ is

$$[\mathbf{H}(z)]_{m,n} = H_m^{n:N}(z), \qquad m = 0:M-1, \quad n = 0:N-1. \tag{2.17}$$
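A minimal sketch of how such an $M \times N$ polyphase matrix could be assembled from a set of subband filters is given below. Each matrix entry is stored as a list of coefficients in powers of $z^{-1}$. The check at the end evaluates the vector identity that follows from combining the polyphase decomposition with the matrix definition, namely that the $m$-th subband transfer function equals $\sum_n [\mathbf{H}(z^N)]_{m,n}\, z^{-n}$. The names `analysis_polyphase_matrix`, `poly_eval` and `bank_via_matrix`, as well as the toy filters, are assumptions for illustration.

```python
import cmath

def analysis_polyphase_matrix(filters, N):
    """Entry (m, n) holds the coefficients of the n-th of N polyphase
    components of the m-th filter, i.e. the taps h_m[n + N*k]."""
    return [[h[n::N] for n in range(N)] for h in filters]

def poly_eval(coeffs, z):
    """Evaluate sum_k coeffs[k] * z**(-k)."""
    return sum(c * z ** (-k) for k, c in enumerate(coeffs))

def bank_via_matrix(P, z):
    """Recover [H_0(z), ..., H_{M-1}(z)] as P(z^N) times [1, z^-1, ..., z^-(N-1)]."""
    N = len(P[0])
    return [sum(poly_eval(P[m][n], z ** N) * z ** (-n) for n in range(N))
            for m in range(len(P))]

filters = [[1, 2, 3, 4, 5], [0, 1, 0, -1, 2, 3], [2, 0, 0, 1]]  # M = 3 toy filters
P = analysis_polyphase_matrix(filters, N=2)                      # 3 x 2 matrix
z0 = 0.9 * cmath.exp(0.3j)                                       # arbitrary test point
via_matrix = bank_via_matrix(P, z0)
direct = [poly_eval(h, z0) for h in filters]
```

Here `via_matrix` and `direct` agree up to rounding, which is the matrix form of the per-filter decomposition.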

Figure 2.4: Analysis filter polyphase decomposition

Synthesis part

For the synthesis part a similar derivation can be made. By polyphase decomposition

$$G_m(z) = \sum_{n=0}^{N-1} z^{-n} G_m^{n:N}(z^N), \tag{2.18}$$

and swapping the polyphase components and the upsamplers, figure 2.6 is obtained.

All $N$-th order polyphase components are contained in the synthesis polyphase matrix $\mathbf{G}(z)$ [156]:

$$[\mathbf{G}(z)]_{m,n} = G_m^{n:N}(z), \qquad m = 0:M-1, \quad n = 0:N-1. \tag{2.19}$$
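The synthesis-side counterpart can be sketched in the same way: upsampling by $N$ followed by filtering with $g_m[k]$ gives the same result as filtering the low-rate signal with each polyphase component and interleaving the branch outputs. The sketch below assumes finite-length sequences and uses hypothetical helper names (`convolve`, `upsample_then_filter`, `polyphase_upsample`).

```python
def convolve(a, b):
    """Plain linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def upsample_then_filter(g, u, N):
    """Reference: insert N-1 zeros after each sample, then filter at the high rate."""
    up = [0.0] * (len(u) * N)
    up[::N] = u
    return convolve(g, up)

def polyphase_upsample(g, u, N):
    """Same output: filter u with each polyphase component at the low rate,
    then place branch n at output positions n, n + N, n + 2N, ..."""
    n_out = len(g) + len(u) * N - 1
    y = [0.0] * n_out
    for n in range(N):
        g_n = g[n::N]                      # taps g[n + N*k], k = 0, 1, ...
        if not g_n:
            continue
        for i, v in enumerate(convolve(g_n, u)):
            j = n + N * i
            if j < n_out:
                y[j] = v
    return y

g = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy synthesis filter
u = [1.0, -2.0, 3.0]            # toy low-rate subband signal
y_ref = upsample_then_filter(g, u, 2)
y_poly = polyphase_upsample(g, u, 2)
```

In the polyphase route no multiplications are spent on the inserted zeros, which is where the saving on the synthesis side comes from.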

By combining the analysis and synthesis part, figure 2.1 can be re-arranged, resulting in figure 2.7 (omitting the intermediate processing for the moment). It is observed that for the analysis part the analysis polyphase matrix $\mathbf{H}(z)$ is used, whereas at the synthesis side $\mathbf{J}\mathbf{G}^T(z)$ is found. $\mathbf{J}$ is the exchange matrix with ones along its main
