Bayesian Extension of MUSIC for Sound Source Localization and Tracking

This paper presents a Bayesian extension of a MUSIC-based method for sound source localization (SSL) and tracking. SSL is important for distant speech enhancement and for simultaneous speech separation, both of which improve speech recognition, as well as for auditory scene analysis by mobile robots. One drawback of existing SSL methods is that they require careful parameter tuning; for example, the sound source detection threshold depends on the reverberation time and the number of sources. Our contribution consists of (1) automatic parameter estimation in the variational Bayesian framework and (2) tracking of sound sources with reliability. Experimental results demonstrate that our method robustly tracks multiple sound sources in a reverberant environment with RT20 = 840 ms.

Index Terms: simultaneous sound source localization, MUSIC algorithm, variational Bayes, particle filter
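
To make the threshold-tuning drawback concrete, the following is a minimal sketch of conventional (non-Bayesian) narrowband MUSIC with a hand-set detection threshold. It is not the method proposed in this paper. The uniform linear array, the single 2 kHz frequency bin, the simulated sources at -20 and 35 degrees, and all function names (steering_vector, simulate_snapshots, music_spectrum, detect_peaks) are assumptions chosen for illustration only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(theta_rad, n_mics, spacing_m, freq_hz):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing_m * np.sin(theta_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def simulate_snapshots(source_deg, n_mics, spacing_m, freq_hz,
                       n_frames=200, noise_std=0.1, seed=0):
    """Simulate (n_mics, n_frames) snapshots of uncorrelated sources plus noise."""
    rng = np.random.default_rng(seed)
    A = np.stack([steering_vector(np.radians(d), n_mics, spacing_m, freq_hz)
                  for d in source_deg], axis=1)
    shape_s = (len(source_deg), n_frames)
    s = (rng.standard_normal(shape_s) + 1j * rng.standard_normal(shape_s)) / np.sqrt(2)
    shape_n = (n_mics, n_frames)
    n = noise_std * (rng.standard_normal(shape_n) + 1j * rng.standard_normal(shape_n)) / np.sqrt(2)
    return A @ s + n


def music_spectrum(snapshots, n_sources, angles_rad, spacing_m, freq_hz):
    """MUSIC pseudo-spectrum; note that n_sources must be supplied by hand."""
    n_mics, n_frames = snapshots.shape
    R = snapshots @ snapshots.conj().T / n_frames   # spatial correlation matrix
    _, eigvecs = np.linalg.eigh(R)                  # eigenvalues in ascending order
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    spectrum = np.empty(len(angles_rad))
    for i, theta in enumerate(angles_rad):
        a = steering_vector(theta, n_mics, spacing_m, freq_hz)
        proj = noise_subspace.conj().T @ a
        spectrum[i] = np.vdot(a, a).real / max(np.vdot(proj, proj).real, 1e-12)
    return spectrum


def detect_peaks(spectrum, angles_rad, threshold_db):
    """Return angles (degrees) of local maxima above threshold_db relative to the global peak."""
    p_db = 10.0 * np.log10(spectrum / spectrum.max())
    inner = p_db[1:-1]
    is_peak = (inner > p_db[:-2]) & (inner >= p_db[2:]) & (inner > threshold_db)
    return np.degrees(angles_rad[1:-1][is_peak])


n_mics, spacing_m, freq_hz = 8, 0.04, 2000.0
angles = np.radians(np.arange(-90, 91))             # 1-degree search grid
x = simulate_snapshots([-20.0, 35.0], n_mics, spacing_m, freq_hz)
spec = music_spectrum(x, n_sources=2, angles_rad=angles,
                      spacing_m=spacing_m, freq_hz=freq_hz)
detected = detect_peaks(spec, angles, threshold_db=-20.0)
print("detected directions (deg):", detected)
```

In this sketch both the assumed number of sources and threshold_db are fixed by hand. Lowering the threshold can only admit more local maxima, so a value that works in one acoustic condition may admit spurious peaks in another; this is the kind of manual tuning the abstract describes as a drawback.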
