Speaker localization using stereo-based sound source localization

This paper addresses audio speaker localization, that is, determining the position of the active speaker in a meeting room. It is the first step of speaker tracking, which is the overall goal of our research. In the proposed approach, a two-channel (stereo) estimate of the speaker position is obtained by comparing the signals received by two cardioid microphones placed opposite each other and separated by a fixed distance. The localization technique is inspired by the human ears, which act as two distinct sound observation points and allow humans to estimate the direction of a speaking person with good precision. Off-line speaker tracking experiments were carried out in a small meeting room without echo cancellation. The results indicate that the proposed localization methods perform well.
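The abstract does not state how the two microphone signals are compared. As a minimal sketch only, the following assumes one common option for a two-microphone setup: estimating the time delay between the channels from the peak of their cross-correlation and converting it to a direction angle under a far-field assumption. The function names, the speed-of-sound constant, and the far-field model are illustrative assumptions, not the method of the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature (assumption)


def estimate_delay(left, right, fs):
    """Estimate the delay (in seconds) of `right` relative to `left`.

    The delay is taken at the peak of the full cross-correlation.
    A positive value means the sound reached the left microphone first.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    corr = np.correlate(right, left, mode="full")
    # Index 0 of the "full" output corresponds to lag -(len(left) - 1).
    lag = int(np.argmax(corr)) - (len(left) - 1)
    return lag / fs


def delay_to_angle(delay, mic_distance):
    """Convert a time delay to a direction angle in degrees (far-field model).

    0 degrees is straight ahead (broadside to the microphone pair).
    The ratio is clipped so that noisy delays never leave the arcsin domain.
    """
    ratio = np.clip(SPEED_OF_SOUND * delay / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


if __name__ == "__main__":
    fs = 16000                      # sampling rate in Hz (hypothetical)
    rng = np.random.default_rng(0)
    source = rng.standard_normal(2048)
    shift = 3                       # simulated delay in samples
    left = source
    right = np.concatenate([np.zeros(shift), source[:-shift]])
    tau = estimate_delay(left, right, fs)
    print("delay (s):", tau)
    print("angle (deg):", delay_to_angle(tau, mic_distance=0.2))
```

With directional (cardioid) microphones, the level difference between the two channels could also carry direction information; this sketch uses only the time delay.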
