Making Sense of Sound

(1)


ESAT Inaugural Lecture 2014

Prof. dr. ir. Toon van Waterschoot

Faculty of Engineering Technology

ESAT – Department of Electrical Engineering

(2)

Academic Career

1979: born in Lier, Belgium

1987: started playing music & being fascinated by sound

1996: decided to study Electrical Engineering (for specializing in audio signal processing)

2001: graduated as Master in Electrical Engineering

2002: started teaching at Antwerp Maritime Academy

2003: started PhD at KU Leuven (Prof. Marc Moonen)

2009: graduated as PhD in Electrical Engineering

2010: moved to TU Delft as postdoc (Prof. Geert Leus)

2011: returned to KU Leuven as FWO postdoc (Prof. Marc Moonen & Prof. Moritz Diehl)

(3)

Affiliations

•  KU Leuven, Belgium: Assistant Professor

  STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics

  AdvISe – Advanced Integrated Sensing Lab (KU Leuven Campus Geel)

•  TU Delft, The Netherlands: Visiting Researcher

  Circuits & Systems Group

•  University of Lugano, Switzerland: Visiting Lecturer

(4)

Teaching

•  KU Leuven, Faculty of Engineering Technology

  Control Theory

  Digital Signal Processing

  Databases & Web Technology (2014-2015)

•  KU Leuven, Faculty of Engineering Science

  Optimization (2014-2015)

•  University of Lugano, Faculty of Informatics

  Digital Signal Processing

(5)

Research Projects

•  FWO Postdoctoral Research Fellowship

A Signal-Oriented Approach to Acoustic Signal Enhancement

•  FP7 Marie Curie Initial Training Network “DREAMS”

Dereverberation & Reverberation of Audio, Music, and Speech

•  IWT O&O Project “Cochlear-IV”

Signal Transmission Schemes for Auditory Implants

•  KU Leuven Programme Financing “OPTEC”

Optimization in Engineering Center

•  IWT Strategic Basic Research Project “ALADIN”

Adaptation & Learning for Assistive Domestic Vocal Interfaces

•  IWT Strategic Basic Research Project “SINS”

Sound Interfacing Through the Swarm

[Figure: excerpt of an FWO (Pegasus - Short) application form. Title of PhD dissertation: “Compressive Sampling for Wireless Communications”. Title of research proposal: “Finite element based power spectrum cartography for cognitive radios”. Scientific field: Science and Technology. FWO Expert Panel: Informatics and Knowledge Technology (W&T5).]

(6)

Research Team

•  Enzo De Sena: Postdoc (DREAMS)

•  Giuliano Bernardi: PhD Student (Cochlear-IV)

•  Giacomo Vairetti: PhD Student (DREAMS)

•  Niccoló Antonello: PhD Student (DREAMS)

•  Gert Dekkers: PhD Student (ALADIN/SINS)

•  Mina Shehata: PhD Student (ALADIN/SINS)

•  Pablo Peso: PhD Student, Nuance/Imperial (DREAMS)

•  Ante Jukić: PhD Student, Univ. of Oldenburg (DREAMS)

•  Neo Kaplanis: PhD Student, B&O/Aalborg Univ. (DREAMS)

(7)

Sound and Electrical Engineering

•  Sound = pressure wave propagating in medium (air)

•  Pressure wave can be converted into electrical voltage and vice versa

  microphone

  loudspeaker

•  Audio signal processing:

[Figure: signal chain from sound signal to electrical signal, through an electrical circuit, and from electrical signal back to sound signal]

[Figure: longitudinal sound wave, sound pressure versus position. © Institute for Sound and Vibration Research, Southampton University]

(8)

Digital Signal Processing

• Digital audio signal processing:

  analog-to-digital (A/D) conversion: sampling + quantization

  algorithm = analysis/manipulation of sound by means of mathematical operations (+−×÷) & matrix algebra

[Figure: signal chain from sound signal to analog electrical signal, A/D, digital electrical signal, digital electrical circuit, digital electrical signal, D/A, analog electrical signal, and back to sound signal. Inset: amplitude versus continuous time, sampled at discrete times 1, 2, 3, … to give x(1), x(2), x(3), …]

$$\mathbf{x} = \begin{bmatrix} x(1) \\ x(2) \\ x(3) \\ \vdots \end{bmatrix}$$
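The two A/D steps named on this slide, sampling and quantization, can be illustrated with a minimal sketch. The function name `sample_and_quantize`, the 440 Hz test tone, the 8 kHz sampling rate and the 8-bit uniform quantizer are all illustrative assumptions, not values from the lecture.

```python
import math

def sample_and_quantize(analog, duration_s, fs_hz, num_bits, full_scale=1.0):
    """Sample a continuous-time function and round each sample to a uniform grid."""
    num_levels = 2 ** num_bits
    step = 2.0 * full_scale / num_levels            # quantization step size
    samples = []
    for n in range(int(round(duration_s * fs_hz))):
        value = analog(n / fs_hz)                   # sampling: evaluate at t = n / fs
        code = math.floor(value / step + 0.5)       # quantization: nearest level index
        # keep the integer code inside the representable range
        code = max(-num_levels // 2, min(num_levels // 2 - 1, code))
        samples.append(code * step)
    return samples

def tone(t):
    """Hypothetical analog input: a 440 Hz sine below full scale."""
    return 0.8 * math.sin(2.0 * math.pi * 440.0 * t)

# The list x plays the role of the vector [x(1), x(2), x(3), ...] on the slide.
x = sample_and_quantize(tone, duration_s=0.01, fs_hz=8000, num_bits=8)
```

As long as the input stays within full scale, the quantization error of each sample is at most half a step.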

(9)

Challenges in Digital Audio Processing

• Modeling:

  sound = natural/physical phenomenon

  no exact/unique signal models available

• Processing:

  sound = broadband signal

  high sampling frequency

  high computational processing load

• Evaluation:

  what really matters is: “How does it sound?”

  subjective measure that is hard to objectify

  listening tests: time-consuming & expensive

(10)

Audio research is by definition multi-disciplinary research!

[Diagram: “Making Sense of Sound” at the center, surrounded by the contributing disciplines]

•  Signal processing

•  Machine learning

•  Room acoustics

•  Psycho-acoustics

•  Numerical optimization

(11)

Making Sense of Sound

• How to improve the perceived quality of sound?

Audio signal enhancement

• How to model the interaction between sound and environment?

Acoustic modeling

• How to extract information from sound?

Audio analysis

• How to create an immersive and healthy sound experience?

Audio reproduction

[Timeline across the four themes: PAST, PRESENT, FUTURE]

(12)

Acoustic feedback control

• Acoustic feedback control:

  acoustic feedback from loudspeaker to microphone causes howling

  adaptive feedback canceller estimates acoustic room model & removes acoustic feedback without distorting source signal

  structuring 5 decades of acoustic feedback research
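Why feedback leads to howling can be sketched with the loop-gain magnitude condition: if the amplifier gain times the feedback path magnitude stays below 1 at every frequency, the closed loop cannot howl. The sketch below computes the largest such gain. The function name and the feedback path coefficients are hypothetical, and because the phase condition is ignored the result is a conservative bound rather than an exact stability limit.

```python
import cmath
import math

def max_stable_gain_db(feedback_path, num_freqs=512):
    """Largest broadband gain G (in dB) with |G * F(w)| < 1 on a frequency grid."""
    peak = 0.0
    for k in range(num_freqs):
        omega = math.pi * k / (num_freqs - 1)       # grid from 0 to pi
        # frequency response of the FIR feedback path at this frequency
        response = sum(f * cmath.exp(-1j * omega * n)
                       for n, f in enumerate(feedback_path))
        peak = max(peak, abs(response))
    return -20.0 * math.log10(peak)

# Hypothetical loudspeaker-to-microphone impulse response (not measured data).
feedback_path = [0.0, 0.0, 0.3, -0.2, 0.1]
msg_db = max_stable_gain_db(feedback_path)
```

An adaptive feedback canceller that subtracts an estimate of the feedback path reduces the residual path magnitude and thereby raises this gain margin.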

[Figure: feedback cancellation block diagram with room, room model and subtraction node (+ −)]

(13)

Acoustic echo cancellation

• Acoustic echo cancellation

  echo removal by adaptive estimation of acoustic room model

  increased double-talk robustness (when both speakers active)

  joint estimation of room model & nonlinear loudspeaker model
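Adaptive estimation of the room model can be sketched with a normalized LMS (NLMS) filter, a common choice for echo cancellation. This is not necessarily the algorithm used in the research described here; the function name, the 3-tap room response and the step size are illustrative assumptions, and no near-end speech (double-talk) or loudspeaker nonlinearity is simulated.

```python
import random

def nlms_echo_canceller(far_end, microphone, num_taps, step_size=0.5, eps=1e-8):
    """Adapt an FIR room model so that its output matches the echo in the microphone."""
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(far_end)):
        # most recent far-end samples, newest first (zeros before the start)
        x_vec = [far_end[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        echo_estimate = sum(w * x for w, x in zip(weights, x_vec))
        error = microphone[n] - echo_estimate       # echo-cancelled output
        norm = sum(x * x for x in x_vec) + eps      # input power normalization
        weights = [w + step_size * error * x / norm
                   for w, x in zip(weights, x_vec)]
        errors.append(error)
    return weights, errors

rng = random.Random(0)
room = [0.6, -0.3, 0.1]                             # hypothetical room response
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
microphone = [sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
              for n in range(len(far_end))]
weights, errors = nlms_echo_canceller(far_end, microphone, num_taps=3)
```

In this noiseless single-talk setting the weights converge to the room response. During double-talk the near-end speech disturbs the update, which is the robustness problem the slide refers to.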


(14)

Clipping & loudspeaker compensation

• Clipping & nonlinear loudspeaker distortion

  analog audio equipment (amplifiers, loudspeakers) as well as digital signal representations have limited dynamic range

  hard or soft clipping causes signal distortion

  perceptually optimal clipping & loudspeaker precompensation

[Figure: precompensation block ahead of the D/A converter, with a perceptual audio signal model and a numerical optimization algorithm]
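The difference between hard and soft clipping can be illustrated with a minimal sketch. The `tanh` curve is only one common soft-clipping characteristic, chosen here as an assumption, and the perceptually optimal precompensation itself is not shown.

```python
import math

def hard_clip(x, limit=1.0):
    """Saturate abruptly at +/- limit."""
    return max(-limit, min(limit, x))

def soft_clip(x, limit=1.0):
    """Saturate smoothly towards +/- limit (tanh characteristic, an assumption)."""
    return limit * math.tanh(x / limit)

# One period of a sine wave that exceeds the available dynamic range by 50 %.
signal = [1.5 * math.sin(2.0 * math.pi * n / 32.0) for n in range(32)]
hard = [hard_clip(v) for v in signal]
soft = [soft_clip(v) for v in signal]
```

Hard clipping leaves samples below the limit untouched but flattens the peaks, while soft clipping distorts all samples slightly and never reaches the limit.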

(15)

Dereverberation

• Multi-channel blind dereverberation

  far-talk microphones record reverberant instead of “dry” sound

  dereverberation = removing effect of room acoustics

  joint source & channel estimation = underdetermined problem

  prior knowledge: speech model, room decay, …

[Figure: joint source & channel estimation block, using prior knowledge and a room model]

(16)

• Room impulse response (RIR)

• Problems

  room impulse responses are very long (10³–10⁴ coefficients)

  truncation results in poor spectral accuracy

  high spatial variability

  inefficient representation of perceptual attributes
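The order of magnitude of the RIR length follows from simple arithmetic, sketched below under the assumption that the response is modeled for the full reverberation time. The function names and example values are illustrative.

```python
def rir_num_coefficients(reverberation_time_s, fs_hz):
    """Number of FIR coefficients needed to cover the reverberation time."""
    return int(round(reverberation_time_s * fs_hz))

def frequency_resolution_hz(num_coefficients, fs_hz):
    """Spacing of the frequency grid resolvable by a response of this length."""
    return fs_hz / num_coefficients

# A 0.5 s reverberation time at 16 kHz already needs 8000 coefficients.
full_length = rir_num_coefficients(0.5, 16000)
# Truncating to 512 coefficients coarsens the resolution to 31.25 Hz.
truncated_resolution = frequency_resolution_hz(512, 16000)
```

This is the trade-off behind the first two problems: long responses are costly, while truncation blurs narrow room resonances.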

Room modeling


(17)

• Room modeling using orthonormal basis functions (OBFs)

  RIR: linear combination of many unit impulse basis functions

  OBF model: linear combination of few “long” basis functions

• Scalable OBF model building algorithm:

  dictionary of OBFs from “overcomplete” pole grid

  select poles using sparsity-promoting criterion
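The idea of replacing many unit impulses by a few “long” basis functions can be sketched with Laguerre functions, the simplest OBF family (one repeated real pole). This is a swapped-in special case: the pole dictionary and the sparsity-promoting pole selection described above are not implemented, and all names and values are illustrative assumptions.

```python
import math

def first_order_filter(b0, b1, a1, x):
    """Run y[n] = b0*x[n] + b1*x[n-1] + a1*y[n-1] over the sequence x."""
    y, x_prev, y_prev = [], 0.0, 0.0
    for sample in x:
        out = b0 * sample + b1 * x_prev + a1 * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y

def laguerre_basis(pole, num_functions, length):
    """Impulse responses of the first Laguerre functions for one real pole."""
    impulse = [1.0] + [0.0] * (length - 1)
    # normalized one-pole low-pass: sqrt(1 - pole^2) / (1 - pole z^-1)
    current = first_order_filter(math.sqrt(1.0 - pole ** 2), 0.0, pole, impulse)
    basis = [current]
    for _ in range(num_functions - 1):
        # all-pass section: (z^-1 - pole) / (1 - pole z^-1)
        current = first_order_filter(-pole, 1.0, pole, current)
        basis.append(current)
    return basis

def project(response, basis):
    """Expansion coefficients: inner products with the orthonormal functions."""
    return [sum(r * b for r, b in zip(response, f)) for f in basis]

basis = laguerre_basis(pole=0.8, num_functions=4, length=400)
target = [0.8 ** n for n in range(400)]     # hypothetical decaying response
coeffs = project(target, basis)
```

In this toy case a 400-tap decaying response is captured by a single nonzero OBF coefficient, because the pole of the basis matches the decay of the response.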

Room modeling: Orthonormal basis functions


(18)

• Room modeling using numerical acoustics

  modeling of complete space-time acoustic wave field

  space-time sampling acoustic wave equation (FEM/FDTD)

• Wave field sensing

  use microphone measurements in numerical model

  source & field estimation using sparse optimization

Room modeling: Numerical acoustics

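The Finite Difference Time Domain Method

Wave equation in 2D: [equation missing in source]

Sound pressure and source signals are sampled uniformly: [equation missing in source]

Second order partial derivatives are approximated as: [equation missing in source]

Room Acoustics: Wave Equation

Wave equation: [equation missing in source], where: [definitions missing in source]

Boundary conditions: [equation missing in source]

Initial conditions: [equation missing in source]

Inverse problem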

Forward problem: source signal is known, sound pressure field is wanted

Inverse problem: sound pressure field is partially known(*), source signal is wanted (underdetermined system)

Exploit sparsity of the source signal!

(*) Microphone measurements in the room

Assumptions:

● Geometry of the room is known

● Boundary conditions are known
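The forward problem can be sketched with a textbook 2D FDTD scheme: the second-order derivatives of the wave equation are replaced by central differences on a uniform space-time grid. The sketch assumes zero-pressure (Dirichlet) boundaries and absorbs the time-step scaling into the source term. Both are simplifications that need not match the lecture's equations, and all names are illustrative.

```python
import math

def fdtd_2d(num_x, num_y, num_steps, source_pos, source_signal, courant=0.5):
    """Leapfrog update of the 2D wave equation on a uniform grid.

    courant = c * dt / dx and must not exceed 1 / sqrt(2) for stability in 2D.
    """
    prev = [[0.0] * num_y for _ in range(num_x)]    # pressure at time step n - 1
    curr = [[0.0] * num_y for _ in range(num_x)]    # pressure at time step n
    sx, sy = source_pos
    lam2 = courant ** 2
    for n in range(num_steps):
        nxt = [[0.0] * num_y for _ in range(num_x)]  # boundary cells stay zero
        for i in range(1, num_x - 1):
            for j in range(1, num_y - 1):
                # central-difference approximation of the spatial second derivatives
                laplacian = (curr[i + 1][j] + curr[i - 1][j]
                             + curr[i][j + 1] + curr[i][j - 1] - 4.0 * curr[i][j])
                nxt[i][j] = 2.0 * curr[i][j] - prev[i][j] + lam2 * laplacian
        nxt[sx][sy] += source_signal(n)              # inject the (scaled) source
        prev, curr = curr, nxt
    return curr

def impulse(n):
    """Hypothetical source signal: a single pulse at the first time step."""
    return 1.0 if n == 0 else 0.0

field = fdtd_2d(21, 21, 15, (10, 10), impulse)
```

In the inverse problem the roles are swapped: a few entries of the pressure field are observed by microphones and the source term is the unknown, which is why additional structure such as sparsity is needed.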

(19)

• Room modeling using image method (1979)

  room boundary reflections modeled by image sources

• Perceptual performance of image method

  despite 2000+ citations, perceptual evaluation is lacking

  time alignment of image sources causes “flanging” effect

  solution: small random displacement of image sources
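The image method and the random displacement can be sketched for a rectangular room in two dimensions. The sketch assumes a single reflection coefficient for all walls, ignores floor and ceiling, and uses 1/(4πd) spreading. The function name, room dimensions and jitter size are illustrative assumptions.

```python
import math
import random

def image_sources_2d(room, source, receiver, max_order, reflection=0.9,
                     speed_of_sound=343.0, jitter_m=0.0, seed=0):
    """Return sorted (delay_s, amplitude) pairs for the image sources of a room."""
    rng = random.Random(seed)
    length_x, length_y = room
    paths = []
    for nx in range(-max_order, max_order + 1):
        for qx in (0, 1):
            for ny in range(-max_order, max_order + 1):
                for qy in (0, 1):
                    # mirror the source across the walls
                    img_x = (1 - 2 * qx) * source[0] + 2 * nx * length_x
                    img_y = (1 - 2 * qy) * source[1] + 2 * ny * length_y
                    num_reflections = (abs(nx - qx) + abs(nx)
                                       + abs(ny - qy) + abs(ny))
                    if num_reflections > 0 and jitter_m > 0.0:
                        # small random displacement of image sources only
                        img_x += rng.uniform(-jitter_m, jitter_m)
                        img_y += rng.uniform(-jitter_m, jitter_m)
                    distance = math.hypot(img_x - receiver[0], img_y - receiver[1])
                    delay = distance / speed_of_sound
                    amplitude = reflection ** num_reflections / (4.0 * math.pi * distance)
                    paths.append((delay, amplitude))
    return sorted(paths)

exact = image_sources_2d((5.0, 4.0), (1.0, 1.0), (3.0, 2.0), max_order=1)
jittered = image_sources_2d((5.0, 4.0), (1.0, 1.0), (3.0, 2.0), max_order=1,
                            jitter_m=0.01)
```

With `jitter_m=0.0` reflections from symmetric image sources can arrive at regularly spaced delays. The jittered variant breaks this regularity while leaving the direct path untouched.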

Room modeling: Image method

[Figure: spectrogram with axes Time [s] and Frequency [kHz], color scale Magnitude [dB]]

(20)

• Perceptual attributes of room acoustics

  broad literature on concert halls, few results for small rooms

• Perceptual models for small room acoustics

  identification & classification of perceptual attributes

  measurement & simulation of small room acoustics

  subjective listening tests link physics and perception

Room modeling: Perceptual models


(21)

• Ear modeling

  acoustic/mechanic sound propagation from inner to outer ear

  cause of acoustic feedback in acoustic hearing implants

• How to characterize inner-to-outer-ear response?

  in-vitro measurements on cadaver heads

Ear modeling

[Figure: Cochlear® Codacs device with actuator, fixation system and implant body, and measurement setup (slides from AdvISe ProjectClub, Giuliano Bernardi)]

(22)

[Figure text: “MS. AMSTERDAM DECLINED TO FORM”]

• Automatic speech recognition (ASR)

  applications: speech-to-text, voice control, surveillance, …

  standard ASR fails in background noise & reverberation

  back-end solution: C₅₀-based acoustic model selection

  front-end solution: noise reduction & dereverberation
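C₅₀ is the clarity index: the ratio, in dB, of the energy in the first 50 ms of a room impulse response to the energy arriving later. The sketch below computes it from a known impulse response. How C₅₀ is estimated from speech alone and how it selects an acoustic model are not shown, and the function name and toy response are assumptions.

```python
import math

def clarity_c50_db(rir, fs_hz):
    """Early-to-late energy ratio with the split at 50 ms, in dB."""
    split = int(round(0.050 * fs_hz))               # index of the 50 ms boundary
    early = sum(h * h for h in rir[:split])
    late = sum(h * h for h in rir[split:])          # must be nonzero
    return 10.0 * math.log10(early / late)

# Toy response at 16 kHz: direct sound plus one reflection after 100 ms.
fs_hz = 16000
rir = [0.0] * 3200
rir[0] = 1.0
rir[1600] = 0.1
c50 = clarity_c50_db(rir, fs_hz)
```

A high C₅₀ indicates little late reverberation, so a recognizer could, for instance, switch to an acoustic model trained on comparably reverberant speech.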

Robust speech recognition

[Figure: ASR block diagram with C₅₀ estimation. Figure text: “LAST YEAR EARLIER THE PLATFORM”]

(23)

• Acoustic event detection

  acoustic event: change in physical condition of sound source

  applications: music transcription, acoustic surveillance, …

• Event detection using sparse signal representations

  two-class dictionaries with steady-state and transient atoms

  criteria to search for most efficient (sparse) representation

  detection based on relative importance of two classes
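The two-class idea can be sketched with a greedy matching pursuit over a small dictionary: cosine (DCT) atoms stand in for the steady-state class and unit impulses for the transient class. Matching pursuit is a swapped-in simple sparsity criterion, and the atom choices, frame length and function names are illustrative assumptions rather than the method from the lecture.

```python
import math

def build_dictionaries(frame_len):
    """Unit-norm steady-state (DCT-II) atoms and transient (impulse) atoms."""
    steady = []
    for k in range(frame_len):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / frame_len)
        steady.append([scale * math.cos(math.pi * (n + 0.5) * k / frame_len)
                       for n in range(frame_len)])
    transient = [[1.0 if n == m else 0.0 for n in range(frame_len)]
                 for m in range(frame_len)]
    return steady, transient

def transient_energy_fraction(frame, max_atoms=5):
    """Share of the explained energy captured by transient atoms (nonzero frame)."""
    steady, transient = build_dictionaries(len(frame))
    atoms = ([(atom, "steady") for atom in steady]
             + [(atom, "transient") for atom in transient])
    residual = list(frame)
    total = sum(v * v for v in frame)
    energy = {"steady": 0.0, "transient": 0.0}
    for _ in range(max_atoms):
        if sum(v * v for v in residual) <= 1e-12 * total:
            break                                   # frame already explained
        best_atom, best_label, best_coef = None, None, 0.0
        for atom, label in atoms:
            coef = sum(r * a for r, a in zip(residual, atom))
            if abs(coef) > abs(best_coef):
                best_atom, best_label, best_coef = atom, label, coef
        # remove the best-matching atom and credit its energy to its class
        residual = [r - best_coef * a for r, a in zip(residual, best_atom)]
        energy[best_label] += best_coef ** 2
    return energy["transient"] / (energy["steady"] + energy["transient"])

steady_atoms, _ = build_dictionaries(32)
tone_frame = steady_atoms[5]                        # purely steady-state frame
click_frame = [1.0 if n == 10 else 0.0 for n in range(32)]  # purely transient frame
```

A frame would be flagged as containing an event when this fraction exceeds a threshold, which corresponds to the detection based on the relative importance of the two classes.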

Acoustic event detection


(24)

Model-based audio analysis

[Diagram: audio analysis links applications and problems]

Applications

•  Hearing aids

•  Music technology

•  Bio-monitoring

•  Personal audio devices

•  Ambient assisted living

•  Music recommendation

•  Acoustic surveillance

•  Multimedia archiving

Problems

•  Modeling

•  Estimation

•  Detection

•  Classification

•  Separation

•  Optimization

Horizon 2020 Marie Skłodowska-Curie Actions ITN proposal

(25)

Audio reproduction

• Immersive sound experience

  reproduction of arbitrary sound fields

  limiting number of loudspeakers

• Healthy sound experience

  “binge listening”, earphones misuse

  limiting sound pressure levels without compromising sound experience

• Multi-disciplinary approach

  psychoacoustics

  electro- & room acoustics

  numerical optimization

(26)