Simultaneous asynchronous microphone array calibration and sound source localisation

This paper proposes an approach that simultaneously localises sound sources and calibrates an asynchronous microphone array, using a graph-based Simultaneous Localisation and Mapping (SLAM) method. Traditional sound source localisation with a microphone array has two main requirements. First, the geometry of the microphone array must be known. Second, a multichannel analog-to-digital converter is required to obtain synchronous readings of the audio signal. Recent works aim to relax these two requirements by estimating the time offset between each pair of microphones. However, they assume that the clock timing in each microphone's sound card is exactly the same, which requires the clocks in the sound cards to be identically manufactured. The methodology proposed here calibrates an asynchronous microphone array using a graph-based optimisation method borrowed from the SLAM literature, estimating the array geometry and the time offset and clock difference/drift rate of each microphone together with the sound source locations. Simulation and experimental results are presented, which demonstrate the effectiveness of the proposed methodology in accurately estimating the microphone array characteristics needed in realistic settings with asynchronous sound devices.
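
The quantities named in the abstract (array geometry, per-microphone time offset, and clock drift rate) can be tied together by a simple measurement model. The sketch below is an illustration only and is not taken from the paper: it assumes that the time difference of arrival (TDOA) between a microphone and a reference microphone is the geometric delay difference plus a constant offset plus a drift term that grows linearly with elapsed time. The function names, the linear drift assumption, and the speed-of-sound constant are all hypothetical choices made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def predicted_tdoa(mic, ref_mic, source, offset, drift, elapsed):
    """Predicted TDOA in seconds of `mic` relative to `ref_mic`.

    Assumed model (not from the paper): geometric delay difference
    + constant start-time offset + clock drift accumulated linearly
    over `elapsed` seconds of recording.
    """
    d_mic = math.dist(mic, source)      # source-to-microphone distance (m)
    d_ref = math.dist(ref_mic, source)  # source-to-reference distance (m)
    return (d_mic - d_ref) / SPEED_OF_SOUND + offset + drift * elapsed


def residual(measured, mic, ref_mic, source, offset, drift, elapsed):
    """Difference between a measured TDOA and the model prediction.

    In a graph-based formulation, a term like this would be the error
    attached to one edge linking a source node and a microphone node.
    """
    return measured - predicted_tdoa(mic, ref_mic, source, offset, drift, elapsed)


# Hypothetical 2-D example: reference microphone at the origin,
# second microphone 3 m away, source 4 m from the reference.
tdoa = predicted_tdoa((3.0, 0.0), (0.0, 0.0), (0.0, 4.0),
                      offset=0.01, drift=1e-4, elapsed=10.0)
print(f"predicted TDOA: {tdoa:.6f} s")
```

Under this assumed model, a graph optimiser would adjust the microphone positions, offsets, drift rates, and source positions so that the sum of squared residuals over all measurements is minimised.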
