Machine Learning Laboratory
 
  

Auditory Scene Analysis

Awards

Kay H. Brodersen received a Trainee Abstract Award at HBM 2012 for his work on 'Model-Based Clustering Using Generative Embedding.' Kay will be giving a talk on his results in Beijing on 12 June.

François Cellier received the McLeod Founder's Award of the Society for Modeling and Simulation International.

Spotlight




Computational models of brain connectivity, coupled with machine learning algorithms, make it possible to infer neuronal disease mechanisms from non-invasive functional magnetic resonance imaging (fMRI) data in humans. This illustration shows how dynamic systems models can be used for reducing complex (high-dimensional) brain activity data to a simple (low-dimensional) and mechanistically interpretable representation (Brodersen et al., PLoS Comput. Biol. 2011). Please also see the summary on ETH Life.

Introduction

The human auditory system separates relevant sounds from noise and irrelevant acoustic input. In hearing-impaired people, this ability is often significantly reduced. In addition, their resolution in time and frequency is degraded, which makes it difficult to locate a sound source accurately. In collaboration with Phonak AG, we develop methods to analyze acoustic scenes and the needs of hearing instrument wearers, with the goal of optimal adaptive control of the hearing instrument. Our current research focuses on hierarchical classification, component analysis, unsupervised and semi-supervised online learning, and model-based signal processing.

Projects

Component analysis to enhance speech in noisy environments

The speech we hear and the facial movements we see have a high degree of statistical dependence. A person can therefore follow a conversation better if she sees her discussion partner's face. We design algorithms that exploit this dependency to enhance speech in difficult listening environments.
Two observations of the same source: A sequence of video frames of a speaker's face, and the distribution of spectral energy of the speech over time.
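
As a rough illustration of what "statistical dependence" between the two observations can mean, the sketch below measures the largest canonical correlation between an audio feature matrix and a video feature matrix. Canonical correlation analysis is used here only as a familiar stand-in; the page does not state which method the project uses. The synthetic data, the feature dimensions, and the names `first_canonical_correlation`, `audio`, `video` and `unrelated` are all assumptions made for this example.

```python
import numpy as np


def first_canonical_correlation(A, V, reg=1e-6):
    """Largest canonical correlation between two feature sets (rows = time frames)."""
    A = A - A.mean(axis=0)
    V = V - V.mean(axis=0)
    n = A.shape[0]
    # Covariance of each view (slightly regularized) and the cross-covariance.
    Caa = A.T @ A / (n - 1) + reg * np.eye(A.shape[1])
    Cvv = V.T @ V / (n - 1) + reg * np.eye(V.shape[1])
    Cav = A.T @ V / (n - 1)
    # Whiten both views with their Cholesky factors; the singular values of the
    # whitened cross-covariance are the canonical correlations.
    La = np.linalg.cholesky(Caa)
    Lv = np.linalg.cholesky(Cvv)
    M = np.linalg.solve(La, Cav) @ np.linalg.inv(Lv).T
    return np.linalg.svd(M, compute_uv=False)[0]


# Synthetic stand-in data (an assumption): one hidden source drives both views.
rng = np.random.default_rng(0)
n_frames = 2000
source = rng.normal(size=n_frames)
audio = np.outer(source, [1.0, 0.5, -0.5, 0.2]) + 0.5 * rng.normal(size=(n_frames, 4))
video = np.outer(source, [0.8, -0.6, 0.3]) + 0.5 * rng.normal(size=(n_frames, 3))
unrelated = rng.normal(size=(n_frames, 3))  # a view that shares no source with the audio

rho_dependent = first_canonical_correlation(audio, video)
rho_independent = first_canonical_correlation(audio, unrelated)
print(rho_dependent, rho_independent)
```

With a shared source, the first canonical correlation is high; for the unrelated view it stays near zero. A speech enhancement method could in principle use such shared structure to decide which parts of the audio belong to the visible speaker, but that step is not shown here.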
   
Analysis procedures such as principal component analysis (PCA) reveal effects that are present in a data set. They also explain these effects as linear combinations of influence factors. To make the explanations interpretable, sparse ones that depend on only a few influence factors are preferred. We develop algorithms that efficiently handle large data sets with millions of samples and influence factors.
A three-dimensional data set showing a superposition of two effects. One is well explained by the influence factors plotted horizontally, the other by the single influence factor plotted vertically.
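
The difference between a dense and a sparse explanation can be sketched on a small synthetic data set shaped like the one in the figure. The code below computes ordinary PCA with an SVD, then a sparse leading direction with a truncated power iteration that keeps only the `k` largest loadings at each step. This is a generic textbook-style heuristic chosen for brevity, not the algorithm developed in this project, and the data, the function names and the choice `k=2` are assumptions.

```python
import numpy as np


def pca_components(X, n_components):
    """Top principal directions (as rows) of X, computed with an SVD."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:n_components]


def sparse_component(X, k, n_iter=200):
    """Leading direction with at most k non-zero loadings (truncated power iteration)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    # Start from the single influence factor with the largest variance.
    v = np.zeros(cov.shape[0])
    v[np.argmax(np.diag(cov))] = 1.0
    for _ in range(n_iter):
        w = cov @ v
        keep = np.argsort(np.abs(w))[-k:]  # indices of the k largest loadings
        w_sparse = np.zeros_like(w)
        w_sparse[keep] = w[keep]
        v = w_sparse / np.linalg.norm(w_sparse)
    return v


# Synthetic 3-D data (an assumption): one effect lives in the two horizontal
# factors, the other in the single vertical factor.
rng = np.random.default_rng(0)
n = 2000
d_horizontal = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
d_vertical = np.array([0.0, 0.0, 1.0])
X = (np.outer(3.0 * rng.normal(size=n), d_horizontal)
     + np.outer(1.0 * rng.normal(size=n), d_vertical)
     + 0.1 * rng.normal(size=(n, 3)))

dense = pca_components(X, 2)
sparse = sparse_component(X, k=2)
print("dense first component: ", dense[0])
print("sparse first component:", sparse)
```

The dense component typically carries a small non-zero loading on every factor, whereas the sparse one sets the vertical loading exactly to zero, which makes it easier to read as "this effect depends only on the two horizontal factors".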
   
Contact: Tomas Dikk, Christian Sigg
   
Collaborators:
* Phonak AG, Dr. Launer 
   
Funding source: Phonak AG, KTI
   
 


© 2012 ETH Zurich | Imprint | Disclaimer | 16 November 2011