Discriminative Binaural Sound Localization

Time difference of arrival (TDOA) is commonly used to estimate the azimuth of a source with a microphone array. The most common methods for estimating TDOA locate extrema in generalized cross-correlation waveforms. In this paper, we apply microphone array techniques to a manikin head. By considering the entire cross-correlation waveform, we achieve azimuth prediction accuracy that exceeds that of extrema-locating methods. We do so by quantizing the azimuthal angle and treating the prediction problem as a multiclass categorization task. We demonstrate the merits of our approach by evaluating the various methods on Sony's AIBO robot.
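The contrast drawn above can be illustrated with a minimal sketch. It is not the authors' implementation: the PHAT weighting, the function names (`gcc_phat`, `peak_lag`, `azimuth_to_class`), the [-90, 90] degree azimuth range, and the equal-width bins are all assumptions made for illustration. The sketch computes a generalized cross-correlation waveform for a two-channel signal, shows the extremum-based lag estimate, and quantizes an azimuth into a class label, so that the whole waveform can serve as a feature vector for any multiclass learner.

```python
import numpy as np

def gcc_phat(left, right, max_lag):
    """PHAT-weighted cross-correlation for lags -max_lag..max_lag (assumed weighting)."""
    n = len(left) + len(right)           # zero-pad so circular correlation equals linear
    L = np.fft.rfft(left, n=n)
    R = np.fft.rfft(right, n=n)
    cross = L * np.conj(R)
    cross /= np.abs(cross) + 1e-12       # keep phase only; epsilon avoids division by zero
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

def peak_lag(waveform, max_lag):
    """Extremum-based estimate: the lag (in samples) of the largest value.
    Sign convention here: a positive lag means the left channel lags the right."""
    return int(np.argmax(waveform)) - max_lag

def azimuth_to_class(azimuth_deg, n_classes):
    """Quantize an azimuth in [-90, 90] degrees into n_classes equal-width bins."""
    width = 180.0 / n_classes
    idx = int((azimuth_deg + 90.0) // width)
    return min(max(idx, 0), n_classes - 1)

# Synthetic example: the right channel is the left channel delayed by 5 samples.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
delay = 5
left = np.concatenate((x, np.zeros(delay)))
right = np.concatenate((np.zeros(delay), x))

features = gcc_phat(left, right, max_lag=20)  # whole waveform: 41-dimensional feature vector
estimate = peak_lag(features, max_lag=20)     # extremum-based baseline
label = azimuth_to_class(0.0, n_classes=18)   # class index for a source straight ahead
```

In a discriminative setup, pairs of `features` and `label` would be passed to a multiclass classifier in place of the single number returned by `peak_lag`; the choice of classifier is left open here.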
