Microphone Clustering and BP Network based Acoustic Source Localization in Distributed Microphone Arrays

Abstract: An acoustic source localization method based on microphone clustering and back-propagation (BP) neural networks is proposed for distributed microphone arrays in an intelligent meeting room. In the proposed method, a novel clustering method first divides all microphones into several clusters, each of which corresponds to a dedicated BP network. An energy-based cluster selection scheme then selects the clusters that are small and close to the acoustic source. In each chosen cluster, the time difference of arrival (TDOA) of each microphone pair is estimated, and the estimated time delays serve as the input of the corresponding BP network for position estimation. Finally, the positions estimated by the chosen clusters are fused into a global position estimate. Because only subsets of the microphones, rather than all of them, take part in localization, the computational cost is lower; moreover, the local estimation in each chosen cluster can be processed in parallel, which can potentially improve localization speed. Simulation results comparing the proposed method with related localization methods confirm its validity.

Index Terms: acoustic source localization, BP neural network, microphone clustering, GCC-PHAT, TDOA.
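The select-then-fuse flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the selection rule or the fusion rule, so the sketch assumes two simple stand-ins: clusters are ranked by mean frame energy (as a proxy for closeness to the source), and local estimates are fused by an unweighted average. The names `frame_energy`, `select_clusters`, `fuse`, and `localize` are hypothetical, and the per-cluster estimator (TDOA estimation followed by the BP network) is passed in as an opaque callable rather than implemented.

```python
from statistics import fmean
from typing import Callable, Dict, List, Sequence, Tuple

Position = Tuple[float, float]   # (x, y) in metres; 2-D is an assumption
Frame = Sequence[float]          # one frame of samples from one microphone
Estimator = Callable[[List[Frame]], Position]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of one microphone frame."""
    return fmean(s * s for s in frame)


def select_clusters(frames: Dict[str, List[Frame]], n_select: int) -> List[str]:
    """Keep the n_select clusters with the highest average microphone energy.

    Assumption: higher received energy indicates a cluster closer to the
    source. The paper's actual selection criterion may differ.
    """
    energy = {cid: fmean(frame_energy(f) for f in mics)
              for cid, mics in frames.items()}
    return sorted(energy, key=energy.get, reverse=True)[:n_select]


def fuse(positions: Sequence[Position]) -> Position:
    """Fuse local estimates by an unweighted mean (assumed fusion rule)."""
    return (fmean(p[0] for p in positions), fmean(p[1] for p in positions))


def localize(frames: Dict[str, List[Frame]],
             estimators: Dict[str, Estimator],
             n_select: int = 2) -> Position:
    """Run only the selected clusters' estimators, then fuse their outputs.

    Each estimator stands in for "TDOA estimation + BP network" of one
    cluster; the calls are independent and could run in parallel.
    """
    chosen = select_clusters(frames, n_select)
    local = [estimators[cid](frames[cid]) for cid in chosen]
    return fuse(local)


# Toy data: three clusters, with fixed outputs standing in for trained networks.
demo_frames = {
    "A": [[0.5, -0.5], [0.4, -0.4]],
    "B": [[0.1, -0.1]],
    "C": [[0.3, -0.3], [0.3, 0.3]],
}
demo_estimators = {
    "A": lambda mics: (1.0, 2.0),
    "B": lambda mics: (9.0, 9.0),
    "C": lambda mics: (2.0, 4.0),
}
print(localize(demo_frames, demo_estimators, n_select=2))
```

In this toy run the low-energy cluster "B" is never evaluated, which mirrors the stated source of the computational saving: only the selected subsets of microphones contribute to the estimate.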
