A Speaker Localization System for Lecture Room Environment

This paper presents a speaker localization system that was an entry to the Rich Transcription 2005 Spring Meeting Recognition Evaluation. The system was developed at the Institute of Signal Processing at Tampere University of Technology (TUT). The paper describes the framework of the evaluation and the proposed localization system. It extends [1] by reporting the actual performance values of the system. The localization system is based on spatially separate sensor stations. The sensor stations estimate the Direction of Arrival (DOA) of acoustic wavefronts. Each sensor station produces a three-dimensional DOA vector. The DOA vectors estimated at each time instant are combined to calculate the location of the sound source. The performance of the system was determined using a set of predefined metrics. Using multiple metrics makes it possible to evaluate the performance of the localization system from different viewpoints. The overall performance is characterized by the RMS error between the estimates and the reference positions. The results show that the performance of the proposed system is consistent and that its accuracy is satisfactory for the meeting room scenario. However, there is still room for several improvements.
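As an illustration of how DOA vectors from separate stations can be combined into a single position, the following is a minimal sketch and not the method used in the paper. It assumes each station contributes a line starting at the station position and pointing along its DOA vector, and it returns the point that minimizes the sum of squared distances to those lines. The function names `locate_from_doas` and `rms_error`, and the station coordinates in the usage lines, are hypothetical.

```python
import numpy as np


def locate_from_doas(stations, doas):
    """Least-squares point closest to the lines station_i + t * doa_i.

    Illustrative assumption: this closed-form line intersection is a generic
    technique, not necessarily the combination rule used by the authors.
    """
    stations = np.asarray(stations, dtype=float)
    doas = np.asarray(doas, dtype=float)
    # Normalize each DOA vector to unit length.
    doas = doas / np.linalg.norm(doas, axis=1, keepdims=True)

    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, u in zip(stations, doas):
        # Projector onto the plane orthogonal to the DOA direction u.
        P = np.eye(3) - np.outer(u, u)
        A += P
        b += P @ p
    # Solvable when at least two DOA vectors are not parallel.
    return np.linalg.solve(A, b)


def rms_error(estimates, references):
    """Root-mean-square Euclidean distance between estimates and references."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(references, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


# Hypothetical room geometry: three stations and one true source position.
source = np.array([2.0, 3.0, 1.5])
stations = np.array([[0.0, 0.0, 1.0], [5.0, 0.0, 1.0], [0.0, 4.0, 2.0]])
doas = source - stations  # noise-free DOA vectors pointing at the source
estimate = locate_from_doas(stations, doas)
```

With noise-free DOA vectors the lines intersect exactly, so the estimate coincides with the source. With noisy DOA estimates the same solve returns a compromise point, and `rms_error` over many time instants gives an overall figure of the kind described above.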

[1] G. Carter et al. The generalized correlation method for estimation of time delay, 1976.

[2] Tomohiro Nakatani et al. Harmonic sound stream segregation using localization and its application to speech stream segregation, 1999, Speech Commun.

[3] K. Kalliojarvi et al. Low-complexity angle of arrival estimation of wideband signals using small arrays, 1996, Proceedings of 8th Workshop on Statistical Signal and Array Processing.

[4] A. Bregman. Auditory Scene Analysis, 2008.

[5] A. Visa et al. A spatiotemporal approach to passive sound source localization, 2004, IEEE International Symposium on Communications and Information Technology (ISCIT 2004).

[6] Arye Nehorai et al. Wideband source localization using a distributed acoustic vector-sensor array, 2003, IEEE Trans. Signal Process.

[7] Avinash C. Kak et al. Array signal processing, 1985.

[8] Tuomo W. Pirinen et al. Normalized confidence factors for robust direction of arrival estimation, 2005, 2005 IEEE International Symposium on Circuits and Systems.

[9] Péter Molnár et al. Maximum likelihood methods for bearings-only target localization, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).