Sound source localization based on discrimination of cross-correlation functions

Abstract Sound source localization plays a crucial role in many microphone array applications, ranging from speech enhancement to human–computer interfaces in reverberant, noisy environments. The steered response power method using the phase transform (SRP-PHAT) is one of the most popular modern localization algorithms. SRP-based source localizers have proved robust; however, they may fail to locate the sound source under adverse noise and reverberation conditions, especially when the direct paths to the microphones are unavailable. This paper proposes a localization algorithm based on discrimination of cross-correlation functions. The cross-correlation functions are calculated by the generalized cross-correlation phase transform (GCC-PHAT) method. From these cross-correlation functions, the sound source location is estimated by one of two classifiers: a naive Bayes classifier or a Euclidean distance classifier. Simulation results demonstrate that the proposed algorithms provide higher localization accuracy than the SRP-PHAT algorithm in reverberant, noisy environments.
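
The pipeline summarized in the abstract (GCC-PHAT features followed by a distance-based classifier) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`gcc_phat`, `delay`, `classify_euclidean`), the single microphone pair, the pure integer delays standing in for room acoustics, and the use of one noise-free training signal per candidate location as a template are all assumptions made for illustration. The naive Bayes variant is not shown.

```python
import numpy as np

def gcc_phat(x1, x2, max_lag):
    """GCC-PHAT cross-correlation of x1 and x2 for lags -max_lag..+max_lag."""
    n = len(x1) + len(x2)                      # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12             # phase transform: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so index 0 is lag -max_lag and index max_lag is lag 0.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

def delay(sig, d):
    """Hypothetical helper: shift sig by d samples (d may be negative), zero-filled."""
    out = np.zeros_like(sig)
    if d >= 0:
        out[d:] = sig[:len(sig) - d]
    else:
        out[:d] = sig[-d:]
    return out

def classify_euclidean(feature, templates):
    """Return the label whose template is nearest to feature in Euclidean distance."""
    labels = list(templates)
    dists = [np.linalg.norm(feature - templates[label]) for label in labels]
    return labels[int(np.argmin(dists))]

# Toy demo: two candidate locations, modeled only by the inter-microphone delay.
rng = np.random.default_rng(0)
max_lag = 8
location_delays = {"A": 3, "B": -2}            # assumed delays in samples

# "Training": one GCC-PHAT template per candidate location.
train = rng.standard_normal(1024)
templates = {label: gcc_phat(delay(train, d), train, max_lag)
             for label, d in location_delays.items()}

# "Test": a different source signal emitted from location A.
test = rng.standard_normal(1024)
feature = gcc_phat(delay(test, location_delays["A"]), test, max_lag)
print(classify_euclidean(feature, templates))
```

With several microphones, one option is to concatenate the GCC-PHAT vectors of all microphone pairs into a single feature vector before classification; whether the paper does exactly this is not stated in the abstract.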
