Machine Learning Laboratory
 
  

Auditory Scene Analysis

Awards

Kay H. Brodersen received a Trainee Abstract Award at HBM 2012 for his work on 'Model-Based Clustering Using Generative Embedding.' Kay will be giving a talk on his results in Beijing on 12 June.

François Cellier received the McLeod Founder's Award of the Society for Modeling and Simulation International.

Spotlight




Computational models of brain connectivity, coupled with machine learning algorithms, make it possible to infer neuronal disease mechanisms from non-invasive functional magnetic resonance imaging (fMRI) data in humans. This illustration shows how dynamic systems models can be used for reducing complex (high-dimensional) brain activity data to a simple (low-dimensional) and mechanistically interpretable representation (Brodersen et al., PLoS Comput. Biol. 2011). Please also see the summary on ETH Life.

Introduction

The human auditory system selects relevant sounds from noise and irrelevant acoustic input. For hearing-impaired persons, this ability is often significantly reduced. Furthermore, resolution in time and frequency is degraded, which makes it difficult to accurately locate a sound source. In collaboration with Phonak AG, we develop methods to analyze acoustic scenes and the needs of hearing instrument wearers, with the goal of optimal adaptive control of the hearing instrument. Our current research focuses on hierarchical classification, component analysis, unsupervised and semi-supervised online learning, and model-based signal processing.

Projects

Component analysis to enhance speech in noisy environments

The speech we hear and the facial movements we see have a high degree of statistical dependence. Therefore, a person can better follow a conversation if she sees her discussion partner's face. We design algorithms which exploit this dependency to enhance speech in difficult listening environments.    
Two observations of the same source: A sequence of video frames of a speaker's face, and the distribution of spectral energy of the speech over time.
   
Analysis procedures such as PCA reveal effects which are present in a data set. They also explain these effects in terms of linear combinations of influence factors. To make the explanations interpretable, sparse explanations, which depend on only a few influence factors, are preferred. We develop algorithms which can efficiently handle large data sets with millions of samples and influence factors.
A three-dimensional data set showing a superposition of two effects. One is well explained by the influence factors plotted horizontally, the other by the single influence factor plotted vertically.
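
As a rough illustration of what a sparse explanation looks like, the sketch below computes a leading component with at most k nonzero loadings. It uses truncated power iteration, a generic textbook technique chosen here for brevity; it is not claimed to be the algorithm developed in this project, and the function names and toy data are assumptions made for the example.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def sparse_component(cov, k, iters=50):
    """Unit-norm direction with at most k nonzero loadings.

    Truncated power iteration: multiply by the covariance matrix, keep only
    the k largest-magnitude entries, renormalize, repeat.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        keep = set(sorted(range(d), key=lambda i: abs(w[i]), reverse=True)[:k])
        w = [w[i] if i in keep else 0.0 for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate input, nothing to normalize
            return v
        v = [x / norm for x in w]
    return v


# Toy data (hypothetical): one effect spread over the first two factors,
# a second, weaker effect along the third factor alone.
rows = [(float(t), float(t), 0.0) for t in range(-5, 6)]
rows += [(0.0, 0.0, 0.5 * s) for s in range(-5, 6)]
loadings = sparse_component(covariance(rows), k=2)
print(loadings)
```

With k=2 the returned loadings involve only the first two factors, so the component can be read as "effect one depends on factors one and two". A dense PCA loading vector would coincide with this only in such clean toy cases.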
   
Contact: Christian Sigg
   
Collaborators:
   

Generative models for identifying acoustic patterns

Using a generative approach, we aim to model the acoustic environment adequately and thus allow the hearing instrument to precisely recognize mixtures of several sources. Theoretical questions include the convergence of the learning algorithms and a comparison with previously published algorithms by means of statistical learning theory.
   
We consider an additive-generative model for the classification of multi-labeled data. A data item belonging to classes 1, 2 and 3 is interpreted as the sum of one independent sample of each of these three classes.
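
The additive interpretation can be sketched in a few lines. The example below assumes, purely for illustration, that each class emits a one-dimensional Gaussian sample. The source parameters in `SOURCES` and all function names are hypothetical, and the brute-force search over label sets only works for a handful of classes.

```python
import itertools
import math
import random

# Hypothetical 1-D Gaussian source per class: class id -> (mean, variance).
SOURCES = {1: (0.0, 1.0), 2: (5.0, 1.0), 3: (-3.0, 0.5)}


def sample_item(labels, rng):
    """Draw one independent sample per class in `labels` and add them up."""
    return sum(rng.gauss(SOURCES[c][0], math.sqrt(SOURCES[c][1]))
               for c in labels)


def log_likelihood(x, labels):
    """Log density of x given a label set.

    A sum of independent Gaussians is Gaussian, with the means and the
    variances of the contributing sources added up.
    """
    mean = sum(SOURCES[c][0] for c in labels)
    var = sum(SOURCES[c][1] for c in labels)
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def most_likely_labels(x):
    """Non-empty label set with the highest likelihood (exhaustive search)."""
    classes = sorted(SOURCES)
    candidates = [s for r in range(1, len(classes) + 1)
                  for s in itertools.combinations(classes, r)]
    return max(candidates, key=lambda s: log_likelihood(x, s))


rng = random.Random(0)
item = sample_item((1, 2, 3), rng)  # one sample from each of the three classes
print(item, most_likely_labels(item))
```

Under this Gaussian assumption, no separate model has to be trained for each label combination: the density of any mixture follows from the single-class sources.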
     
   
Varying hearing situations imply different hearing requirements. Sound classification is therefore a crucial step for autonomous hearing instruments.
Contact: Andreas Streich
   
Collaborators:
   

Online adaptive learning with sparse labels

The input is the continuously incoming acoustic sound field of the hearing instrument (HI) user. The HI processes the input and, using the information from a classification scheme, adjusts the signal before emitting it to the HI user. The classification scheme should adapt to the user's preferences. We investigate how to learn this adaptation from sparse and potentially biased labels.
   
     
   
An online adaptive learning scheme under constraints of sparse user feedback.
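
A minimal sketch of such a loop is shown below, assuming a binary preference and a plain logistic-regression classifier updated by one gradient step whenever the user gives feedback. The class name, the feature encoding and the learning rate are assumptions for illustration only. The sketch does not correct for biased labels and is not the scheme developed in this project.

```python
import math


class OnlineClassifier:
    """Logistic regression that adapts only on frames with user feedback."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Probability that frame x belongs to class 1."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def observe(self, x, label=None):
        """Classify frame x; take one gradient step if a label (0 or 1) is given."""
        p = self.predict_proba(x)
        if label is not None:
            err = label - p
            self.weights = [w + self.learning_rate * err * xi
                            for w, xi in zip(self.weights, x)]
            self.bias += self.learning_rate * err
        return p


# Hypothetical stream: every frame is classified, but only every 7th frame
# carries user feedback.
clf = OnlineClassifier(n_features=1)
for i in range(2000):
    cls = i % 2
    frame = [1.0] if cls else [-1.0]
    clf.observe(frame, label=cls if i % 7 == 0 else None)
print(clf.predict_proba([1.0]), clf.predict_proba([-1.0]))
```

Unlabeled frames leave the model untouched in this sketch. A semi-supervised variant would also exploit them, which is one reason this should be read as a baseline rather than a solution.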

   
Contact: Yvonne Moh
   
Collaborators:
   
 


© 2012 ETH Zurich | Imprint | Disclaimer | 29 October 2008