Making Sense of Sound
ESAT Inaugural Lecture 2014
Prof. dr. ir. Toon van Waterschoot
Faculty of Engineering Technology
ESAT – Department of Electrical Engineering
Academic Career
1979: born in Lier, Belgium
1987: started playing music & being fascinated by sound
1996: decided to study Electrical Engineering (for specializing in audio signal processing)
2001: graduated as Master in Electrical Engineering
2002: started teaching at Antwerp Maritime Academy
2003: started PhD at KU Leuven (Prof. Marc Moonen)
2009: graduated as PhD in Electrical Engineering
2010: moved to TU Delft as postdoc (Prof. Geert Leus)
2011: returned to KU Leuven as FWO postdoc (Prof. Marc Moonen & Prof. Moritz Diehl)
Affiliations
• KU Leuven, Belgium: Assistant Professor
STADIUS Center for Dynamical Systems,
Signal Processing, and Data Analytics
AdvISe – Advanced Integrated Sensing Lab
(KU Leuven Campus Geel)
• TU Delft, The Netherlands: Visiting Researcher
Circuits & Systems Group
• University of Lugano, Switzerland: Visiting Lecturer
Teaching
• KU Leuven, Faculty of Engineering Technology
Control Theory
Digital Signal Processing
Databases & Web Technology (2014-2015)
• KU Leuven, Faculty of Engineering Science
Optimization (2014-2015)
• University of Lugano, Faculty of Informatics
Digital Signal Processing
Research Projects
• FWO Postdoctoral Research Fellowship
A Signal-Oriented Approach to Acoustic Signal Enhancement
• FP7 Marie Curie Initial Training Network “DREAMS”
Dereverberation & Reverberation of Audio, Music, and Speech
• IWT O&O Project “Cochlear-IV”
Signal Transmission Schemes for Auditory Implants
• KU Leuven Programme Financing “OPTEC”
Optimization in Engineering Center
• IWT Strategic Basic Research Project “ALADIN”
Adaptation & Learning for Assistive Domestic Vocal Interfaces
• IWT Strategic Basic Research Project “SINS”
Sound Interfacing Through the Swarm
Research Team
• Enzo De Sena: Postdoc (DREAMS)
• Giuliano Bernardi: PhD Student (Cochlear-IV)
• Giacomo Vairetti: PhD Student (DREAMS)
• Niccoló Antonello: PhD Student (DREAMS)
• Gert Dekkers: PhD Student (ALADIN/SINS)
• Mina Shehata: PhD Student (ALADIN/SINS)
• Pablo Peso: PhD Student, Nuance/Imperial (DREAMS)
• Ante Jukić: PhD Student, Univ. of Oldenburg (DREAMS)
• Neo Kaplanis: PhD Student, B&O/Aalborg Univ. (DREAMS)
Sound and Electrical Engineering
• Sound = pressure wave propagating
in medium (air)
• Pressure wave can be converted
into electrical voltage and vice versa
microphone
loudspeaker
• Audio signal processing:
electrical circuit
[Figure: sound signal → electrical signal → electrical signal → sound signal]
[Figure: longitudinal sound wave, sound pressure vs. position. © Institute for Sound and Vibration Research, Southampton University]
• Digital audio signal processing:
analog-to-digital (A/D) conversion: sampling + quantization
algorithm = analysis/manipulation of sound by means of
mathematical operations (+−×÷) & matrix algebra
Digital Signal Processing
[Figure: sound signal → analog electrical signal → A/D → digital electrical signal → digital electrical circuit → digital electrical signal → D/A → analog electrical signal → sound signal]
[Figure: amplitude vs. continuous time, sampled at discrete times to give x(1), x(2), x(3), …, collected in the vector x = [x(1) x(2) x(3) …]^T]
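The A/D conversion step (sampling + quantization) can be illustrated with a minimal sketch. The function names, the 440 Hz tone, the 8 kHz sampling rate and the 8-bit mid-rise quantizer are illustrative assumptions, not taken from the lecture.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples):
    """Sample a unit-amplitude sine at discrete times n / fs_hz, n = 0, 1, 2, ..."""
    return [math.sin(2.0 * math.pi * freq_hz * n / fs_hz) for n in range(num_samples)]

def quantize(samples, num_bits):
    """Uniform mid-rise quantizer for samples in [-1, 1] with 2**num_bits levels."""
    levels = 2 ** num_bits
    step = 2.0 / levels
    quantized = []
    for v in samples:
        index = math.floor(v / step)
        # Keep the index inside the available range of levels.
        index = max(-levels // 2, min(levels // 2 - 1, index))
        quantized.append((index + 0.5) * step)
    return quantized

# Assumed example values: 440 Hz tone, 8 kHz sampling, 8 bits.
x = sample_sine(440.0, 8000.0, 16)
x_q = quantize(x, 8)
max_error = max(abs(a - b) for a, b in zip(x, x_q))
print(f"largest quantization error: {max_error:.5f}")
```

With this quantizer the error per sample stays within half a quantization step, so each extra bit halves the worst-case error.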
Challenges in Digital Audio Processing
• Modeling:
sound = natural/physical phenomenon
no exact/unique signal models available
• Processing:
sound = broadband signal
high sampling frequency
high computational processing load
• Evaluation:
what really matters is: “How does it sound?”
subjective measure that is hard to objectify
listening tests: time-consuming & expensive
[Image credits: © Diego Banuelos, © IELTS]
Audio research is by definition multi-disciplinary research!
[Diagram: “Making Sense of Sound” at the center of]
• Signal processing
• Machine learning
• Room acoustics
• Psycho-acoustics
• Numerical optimization
Making Sense of Sound
• How to improve the perceived quality of sound?
Audio signal enhancement
• How to model the interaction between sound and environment?
Acoustic modeling
• How to extract information from sound?
Audio analysis
• How to create an immersive and healthy sound experience?
Audio reproduction
[Figure labels: past, present, future]
Acoustic feedback control
• Acoustic feedback control:
acoustic feedback from loudspeaker to microphone → howling
adaptive feedback canceller estimates acoustic room model
& removes acoustic feedback without distorting source signal
structuring 5 decades of acoustic feedback research
Acoustic signal enhancement | Acoustic modeling | Audio analysis | Audio reproduction
[Figure: room and room model, with the room model output subtracted (+ −). © Schulich School of Music, McGill University]
Acoustic echo cancellation
• Acoustic echo cancellation
echo removal by adaptive estimation of acoustic room model
increased double-talk robustness (when both speakers active)
joint estimation of room model & nonlinear loudspeaker model
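Adaptive estimation of an acoustic room model can be sketched with a normalized LMS (NLMS) filter. NLMS is swapped in here as a common textbook choice; the lecture does not say which adaptive algorithm is used. The three-tap toy room response, the step size and all names are assumptions, and double-talk and loudspeaker nonlinearity are not handled.

```python
import random

def nlms(reference, observed, num_taps, step_size=0.5, eps=1e-8):
    """Adapt an FIR model so that it predicts `observed` from `reference`."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent reference samples, newest first
    errors = []
    for x, d in zip(reference, observed):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate            # echo-cancelled output sample
        power = sum(h * h for h in history) + eps
        weights = [w + step_size * error * h / power
                   for w, h in zip(weights, history)]
        errors.append(error)
    return weights, errors

def simulate_echo(far_end, room):
    """Microphone signal: far-end signal convolved with a toy room response."""
    return [sum(room[k] * far_end[n - k] for k in range(len(room)) if n - k >= 0)
            for n in range(len(far_end))]

rng = random.Random(0)                  # fixed seed for repeatability
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
room = [0.6, -0.3, 0.1]                 # hypothetical, far shorter than a real room response
mic = simulate_echo(far_end, room)
weights, errors = nlms(far_end, mic, num_taps=len(room))
print("estimated room model:", [round(w, 4) for w in weights])
```

In this noise-free toy setting the estimated weights approach the simulated room response and the residual echo in `errors` decays towards zero.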
Clipping & loudspeaker compensation
• Clipping & nonlinear loudspeaker distortion
analog audio equipment (amplifiers, loudspeakers) as well as
digital signal representations have limited dynamic range
hard or soft clipping → signal distortion
perceptually optimal clipping & loudspeaker precompensation
[Figure: precompensation using a perceptual audio signal model and a numerical optimization algorithm, followed by D/A conversion]
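The difference between hard and soft clipping can be shown with a short sketch. The tanh curve is one common soft-clipping shape chosen here as an assumption; the perceptually optimal precompensation from the lecture is not reproduced.

```python
import math

def hard_clip(x, threshold):
    """Limit a sample to [-threshold, threshold] by truncation."""
    return max(-threshold, min(threshold, x))

def soft_clip(x, threshold):
    """Smoothly saturating limiter (tanh shape, an illustrative choice)."""
    return threshold * math.tanh(x / threshold)

# A sine with amplitude 1.5 driven into a limiter with threshold 1.0.
signal = [1.5 * math.sin(2.0 * math.pi * n / 16) for n in range(16)]
hard = [hard_clip(v, 1.0) for v in signal]
soft = [soft_clip(v, 1.0) for v in signal]
print("peak after hard clipping:", max(abs(v) for v in hard))
print("peak after soft clipping:", round(max(abs(v) for v in soft), 4))
```

Hard clipping leaves small samples untouched but flattens peaks abruptly, whereas the soft curve distorts all samples slightly and saturates gradually.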
Dereverberation
• Multi-channel blind dereverberation
far-talk microphones record reverberant instead of “dry” sound
dereverberation = removing effect of room acoustics
joint source & channel estimation = underdetermined problem
prior knowledge: speech model, room decay, …
[Figure: joint source & channel estimation using prior knowledge and a room model]
• Room impulse response (RIR)
• Problems
room impulse responses are very long (10^3–10^4 coefficients)
truncation results in poor spectral accuracy
high spatial variability
inefficient representation of perceptual attributes
Room modeling
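A rough calculation shows where the 10^3–10^4 coefficients come from. The sketch assumes that the RIR is kept for one reverberation time (RT60) and that it is applied as a direct-form FIR filter; the 0.5 s and 16 kHz values are illustrative, not from the lecture.

```python
def rir_length(rt60_s, fs_hz):
    """Number of RIR coefficients if the response is kept for one RT60."""
    return int(round(rt60_s * fs_hz))

def fir_macs_per_second(num_coeffs, fs_hz):
    """Multiply-accumulate operations per second for direct-form FIR filtering."""
    return num_coeffs * fs_hz

# Assumed example: RT60 = 0.5 s, sampling rate = 16 kHz.
num_coeffs = rir_length(0.5, 16000)
macs = fir_macs_per_second(num_coeffs, 16000)
print(num_coeffs, "coefficients,", macs, "MACs per second")
```

Under these assumptions a single channel already needs 8000 coefficients, which is why more compact room models are attractive.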
• Room modeling using orthonormal basis functions (OBFs)
RIR: linear combination of many unit impulse basis functions
OBF model: linear combination of few “long” basis functions
• Scalable OBF model building algorithm:
dictionary of OBFs from “overcomplete” pole grid
select poles using sparsity-promoting criterion
Room modeling: Orthonormal basis functions
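To make the idea of “long” orthonormal basis functions concrete, the sketch below generates discrete-time Laguerre functions, the simplest OBF family (one repeated real pole), and checks their orthonormality numerically. Using Laguerre functions, the pole value 0.6 and the function names are assumptions; the pole-grid selection algorithm from the lecture is not reproduced.

```python
import math

def laguerre_basis(pole, num_functions, length):
    """Impulse responses of the first Laguerre functions for a real pole, |pole| < 1."""
    gain = math.sqrt(1.0 - pole * pole)
    # First function: sqrt(1 - a^2) / (1 - a z^-1).
    current = [gain * pole ** n for n in range(length)]
    basis = [current]
    for _ in range(1, num_functions):
        # Next function: previous one filtered by the all-pass (z^-1 - a) / (1 - a z^-1),
        # i.e. y[n] = a*y[n-1] + x[n-1] - a*x[n].
        filtered = []
        prev_in, prev_out = 0.0, 0.0
        for x in current:
            y = pole * prev_out + prev_in - pole * x
            filtered.append(y)
            prev_in, prev_out = x, y
        current = filtered
        basis.append(current)
    return basis

def inner(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))

basis = laguerre_basis(pole=0.6, num_functions=3, length=400)
gram = [[round(inner(u, v), 6) for v in basis] for u in basis]
print(gram)  # close to the identity matrix if the functions are orthonormal
```

Each basis function is itself an infinitely long decaying response, so a few of them can stand in for many unit-impulse coefficients when the pole matches the room decay.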
• Room modeling using numerical acoustics
modeling of complete space-time acoustic wave field
space-time sampling of the acoustic wave equation (FEM/FDTD)
• Wave field sensing
use microphone measurements in numerical model
source & field estimation using sparse optimization
Room modeling: Numerical acoustics
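The Finite Difference Time Domain Method
Second-order partial derivatives are approximated as: [equation not recovered from slide]
Sound pressure and source signals are sampled uniformly: [equation not recovered from slide]
Wave equation in 2D: [equation not recovered from slide]
Room Acoustics: Wave Equation
Wave equation: [equation not recovered from slide]
Boundary conditions: [equation not recovered from slide]
Initial conditions: [equation not recovered from slide]
Inverse problem
Forward problem: source signal is known, sound pressure field is wanted
Inverse problem: sound pressure field is partially known (*), source signal is wanted → underdetermined system
Exploit sparsity of the source signal!
(*) Microphone measurements in the room
Assumptions:
● Geometry of the room is known
● Boundary conditions are known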
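As a sketch of what the FDTD slides refer to, the standard textbook form of the 2D wave equation and its finite-difference discretization is given below. The symbols p (sound pressure), s (source term), c (speed of sound), Δt, Δx and λ, as well as the equal grid spacing in x and y, are assumptions; the notation used on the original slides may differ.

```latex
% 2D wave equation with source term (textbook form, assumed notation)
\frac{\partial^2 p}{\partial t^2}
  = c^2 \left( \frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} \right) + s(x, y, t)

% Uniform sampling in space and time
p_{i,j}^{n} = p(i\,\Delta x,\; j\,\Delta x,\; n\,\Delta t), \qquad
s_{i,j}^{n} = s(i\,\Delta x,\; j\,\Delta x,\; n\,\Delta t)

% Second-order central differences
\frac{\partial^2 p}{\partial t^2} \approx \frac{p_{i,j}^{n+1} - 2 p_{i,j}^{n} + p_{i,j}^{n-1}}{\Delta t^2}, \qquad
\frac{\partial^2 p}{\partial x^2} \approx \frac{p_{i+1,j}^{n} - 2 p_{i,j}^{n} + p_{i-1,j}^{n}}{\Delta x^2}

% Resulting update rule, with Courant number \lambda = c\,\Delta t / \Delta x
p_{i,j}^{n+1} = 2 p_{i,j}^{n} - p_{i,j}^{n-1}
  + \lambda^2 \left( p_{i+1,j}^{n} + p_{i-1,j}^{n} + p_{i,j+1}^{n} + p_{i,j-1}^{n} - 4 p_{i,j}^{n} \right)
  + \Delta t^2\, s_{i,j}^{n}
```

Stacking all samples into vectors turns the update rule into a linear system relating source and pressure samples, which is the form in which the forward and inverse problems can be posed. In 2D this explicit scheme is stable for λ ≤ 1/√2.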
• Room modeling using image method (1979)
room boundary reflections modeled by image sources
• Perceptual performance of image method
despite 2000+ citations, perceptual evaluation is lacking
time alignment of image sources → “flanging” effect
solution: small random displacement of image sources
Room modeling: Image method
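The image-source idea and the random displacement can be sketched for first-order reflections in a rectangular room. The room dimensions, positions, speed of sound, displacement size and function names are illustrative assumptions; higher-order images and wall absorption are omitted.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def first_order_images(source, room_dims):
    """Mirror the source in each of the six walls of a box with one corner at the origin."""
    images = []
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images

def arrival_delays(points, receiver, c=SPEED_OF_SOUND):
    """Propagation delay in seconds from each point to the receiver."""
    return [math.dist(p, receiver) / c for p in points]

def jitter(points, max_shift, rng):
    """Displace every point by a small random offset per coordinate."""
    return [tuple(coord + rng.uniform(-max_shift, max_shift) for coord in p)
            for p in points]

source = (1.0, 2.0, 1.5)        # hypothetical positions in metres
receiver = (3.0, 2.0, 1.5)
room = (4.0, 5.0, 3.0)
images = first_order_images(source, room)
displaced = jitter(images, max_shift=0.05, rng=random.Random(1))
delays = arrival_delays([source] + displaced, receiver)
print([round(1000.0 * d, 2) for d in delays], "ms")
```

The small displacements break up exactly coinciding arrival times, which is the mechanism the slide names as the remedy for the flanging effect.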
[Figure: time-frequency plot; axes: time [s], frequency [kHz]; magnitude [dB]]
• Perceptual attributes of room acoustics
broad literature on concert halls, few results for small rooms
• Perceptual models for small room acoustics
identification & classification of perceptual attributes
measurement & simulation of small room acoustics
subjective listening tests link physics and perception
Room modeling: Perceptual models
• Ear modeling
acoustic/mechanical sound propagation from inner to outer ear
cause of acoustic feedback in acoustic hearing implants
• How to characterize inner-to-outer-ear response?
in-vitro measurements on cadaver heads
Ear modeling
[Figures: Cochlear® Codacs device (actuator, fixation system, implant body) and measurement setup; from an AdvISe ProjectClub presentation by Giuliano Bernardi]
• Automatic speech recognition (ASR)
applications: speech-to-text, voice control, surveillance, …
standard ASR fails in background noise & reverberation
back-end solution: C50-based acoustic model selection
front-end solution: noise reduction & dereverberation
Robust speech recognition
[Figure: two ASR outputs, “MS. AMSTERDAM DECLINED TO FORM” and “LAST YEAR EARLIER THE PLATFORM”, with C50 estimation]
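C50 (clarity index) is the ratio, in dB, of the energy in the first 50 ms of a room impulse response to the energy after 50 ms. The sketch below computes it from a known RIR; in the lecture's setting C50 is estimated rather than measured, so this only illustrates the quantity. The synthetic exponentially decaying RIR, the 8 kHz rate and the function names are assumptions.

```python
import math

def c50_db(rir, fs_hz):
    """Clarity index C50 in dB; assumes the RIR starts at the direct-path arrival."""
    split = int(round(0.050 * fs_hz))            # sample index at 50 ms
    early = sum(h * h for h in rir[:split])
    late = sum(h * h for h in rir[split:])
    return 10.0 * math.log10(early / late)

def synthetic_rir(rt60_s, fs_hz):
    """Toy RIR whose envelope decays by 60 dB over rt60_s seconds."""
    length = int(round(2.0 * rt60_s * fs_hz))
    return [10.0 ** (-3.0 * n / (rt60_s * fs_hz)) for n in range(length)]

fs = 8000.0
for rt60 in (0.3, 1.0):
    print(f"RT60 = {rt60} s: C50 = {c50_db(synthetic_rir(rt60, fs), fs):.2f} dB")
```

A more reverberant room puts more energy after 50 ms and therefore gives a lower C50, which is what makes it usable for selecting an acoustic model.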
• Acoustic event detection
acoustic event: change in physical condition of sound source
applications: music transcription, acoustic surveillance, …
• Event detection using sparse signal representations
two-class dictionaries with steady-state and transient atoms
criteria to search for most efficient (sparse) representation
detection based on relative importance of two classes
Acoustic event detection
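The two-class dictionary idea can be sketched with a greedy sparse approximation. Matching pursuit, the cosine (DCT-type) atoms for the steady-state class and the unit-impulse atoms for the transient class are all assumptions chosen for illustration; the lecture does not specify the dictionaries or the sparsity criterion.

```python
import math

def build_dictionaries(n):
    """Unit-norm atoms: cosines as steady-state class, unit impulses as transient class."""
    steady = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        steady.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    transient = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    return steady, transient

def transient_share(signal, num_atoms=4):
    """Fraction of the approximated energy explained by transient atoms."""
    steady, transient = build_dictionaries(len(signal))
    atoms = [("steady", a) for a in steady] + [("transient", a) for a in transient]
    residual = list(signal)
    energy = {"steady": 0.0, "transient": 0.0}
    for _ in range(num_atoms):
        # Matching pursuit step: pick the atom most correlated with the residual.
        label, atom, coef = max(
            ((lab, a, sum(r * v for r, v in zip(residual, a))) for lab, a in atoms),
            key=lambda item: abs(item[2]))
        energy[label] += coef * coef
        residual = [r - coef * v for r, v in zip(residual, atom)]
    total = energy["steady"] + energy["transient"]
    return energy["transient"] / total if total > 0.0 else 0.0

n = 32
tone = [math.cos(math.pi * (i + 0.5) * 3 / n) for i in range(n)]   # steady-state frame
click = [1.0 if i == 10 else 0.0 for i in range(n)]                 # transient frame
print("transient share, tone: ", round(transient_share(tone), 3))
print("transient share, click:", round(transient_share(click), 3))
```

A detector could then flag an event when the transient share of a frame exceeds a threshold; the threshold itself would have to be tuned on data.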
[Image credits: © SST, Inc.; © Centre for Digital Music, Queen Mary University of London; © Why Cry]
Model-based audio analysis
Audio analysis
Applications
• Hearing aids
• Music technology
• Bio-monitoring
• Personal audio devices
• Ambient assisted living
• Music recommendation
• Acoustic surveillance
• Multimedia archiving
Problems
• Modeling
• Estimation
• Detection
• Classification
• Separation
• Optimization
Horizon 2020 Marie Skłodowska-Curie Actions ITN proposal
[Image credits: © Fraunhofer IDMT, © Vilamundo, © Adam Kutner]
Audio reproduction
• Immersive sound experience
reproduction of arbitrary sound fields
limiting number of loudspeakers
• Healthy sound experience
“binge listening”, earphone misuse
limiting sound pressure levels without
compromising sound experience
• Multi-disciplinary approach
psychoacoustics
electro- & room acoustics
numerical optimization
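Limiting sound pressure levels starts from the level definition L = 20 log10(p / p_ref) with p_ref = 20 µPa. The sketch below computes a level and the plain gain reduction needed to stay under a limit. The 85 dB limit and the function names are illustrative assumptions, and a fixed gain is the naive baseline rather than the perceptually motivated approach the lecture aims for.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascal (standard value in air)

def spl_db(p_rms):
    """Sound pressure level in dB SPL for an RMS pressure in pascal."""
    return 20.0 * math.log10(p_rms / P_REF)

def limiting_gain(level_db, max_level_db):
    """Linear gain (<= 1) that brings a level down to the allowed maximum."""
    excess_db = max(0.0, level_db - max_level_db)
    return 10.0 ** (-excess_db / 20.0)

level = spl_db(1.0)                      # 1 Pa RMS
gain = limiting_gain(100.0, 85.0)        # assumed limit of 85 dB SPL
print(f"1 Pa RMS corresponds to {level:.1f} dB SPL")
print(f"gain needed to go from 100 dB to 85 dB: {gain:.3f}")
```

A plain gain reduction lowers everything equally, which is exactly the loss of sound experience that a psychoacoustically informed method would try to avoid.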