Euclidean distance matrix completion for ad-hoc microphone array calibration

This paper applies missing-data recovery via matrix completion to audio sensor networks. We propose a method based on Euclidean distance matrix completion for calibrating the microphone locations of an ad-hoc array. The method can calibrate a full network from partial connectivity information. The pairwise distances between microphones in close proximity are estimated using the coherence model of the diffuse noise field. A distance matrix of the ad-hoc network is then constructed in which the entries for microphone pairs separated by more than a threshold distance are missing. We exploit the low-rank property of the squared distance matrix (its rank is at most d + 2 for points in d dimensions, so at most 5 for microphones in 3-D space) and apply a matrix completion method to recover the missing entries. To constrain the recovered matrix to Euclidean space geometry, we propose the additional use of the Cadzow algorithm for matrix completion. The proposed method is evaluated on real data recordings, where it achieves a significant improvement over the state of the art.
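The pipeline described in the abstract can be illustrated with a minimal sketch, written under several assumptions that are not taken from the paper. The sinc-shaped coherence function is the standard model for a spherically isotropic diffuse field. The completion loop is a simplified Cadzow-style alternating projection (rank truncation, then symmetry, non-negativity, zero diagonal and known entries), used here as a stand-in rather than the authors' exact algorithm. The function names, the speed of sound and the iteration count are hypothetical choices.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def diffuse_coherence(freq_hz, dist_m, c=SPEED_OF_SOUND):
    """Theoretical coherence of two microphones in a spherically isotropic diffuse field."""
    # np.sinc(x) = sin(pi x) / (pi x), so this is sin(2 pi f d / c) / (2 pi f d / c).
    return np.sinc(2.0 * freq_hz * dist_m / c)


def squared_edm(points):
    """Squared Euclidean distance matrix of an (n, d) array of positions."""
    gram = points @ points.T
    sq_norms = np.diag(gram)
    return sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram


def complete_edm(d_obs, mask, dim=3, n_iter=200):
    """Fill missing squared distances by alternating projections (illustrative only).

    d_obs : (n, n) squared distances; values where mask is False are ignored.
    mask  : (n, n) symmetric boolean array, True where a distance was measured
            (the diagonal must be True, with zero distance).
    """
    rank = dim + 2  # a squared EDM of points in `dim` dimensions has rank <= dim + 2
    # Initialise the missing entries with the mean of the observed ones.
    d = np.where(mask, d_obs, d_obs[mask].mean())
    for _ in range(n_iter):
        # Low-rank projection: keep the `rank` largest singular components.
        u, s, vt = np.linalg.svd(d)
        d = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Structural constraints of a squared distance matrix.
        d = 0.5 * (d + d.T)
        d = np.maximum(d, 0.0)
        np.fill_diagonal(d, 0.0)
        # Consistency with the measured short-range distances.
        d[mask] = d_obs[mask]
    return d
```

In this sketch, the measured entries would come from fitting `diffuse_coherence` to the observed coherence of nearby microphone pairs, and `mask` would mark the pairs whose estimated distance falls below the chosen threshold. Convergence of such a simple loop is not guaranteed; it only shows how the low-rank and geometric constraints interact.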
