Synchronization based on mixture alignment for sound source separation in wireless acoustic sensor networks

Desynchronization degrades the performance of many signal processing algorithms in Wireless Acoustic Sensor Networks (WASNs). It is mainly caused by the different distances between the source and each node, and by clock phase offset and frequency skew. Classical solutions use clock synchronization protocols and algorithms in the communication layer, but these alternatives do not tackle the lack of synchronization caused by the distances between sources and nodes. In this paper, we present a novel study of the synchronization problem in acoustic sensor networks from a signal processing point of view. First, we propose a theoretical framework that allows us to study the effects of misalignment on any short-time based algorithm, focusing on the requirements on the effective length of the analysis time frame. From this framework, a theoretical synchronization delay is established, aimed at reducing the required length of the time frame. Second, two novel alignment methods are developed and tuned to reduce the amount of synchronization information that must be transmitted. The results demonstrate that the proposed methods perform well with respect to the separation quality achieved by a standard Blind Source Separation (BSS) algorithm, while reducing the transmission bandwidth required for synchronization data.

Highlights
- We study the effects of misalignment on BSS algorithms in WASNs.
- A theoretical synchronization delay relaxes the constraint on the time frame length.
- Two novel alignment methods are inspired by the theoretical synchronization.
- The proposed methods synchronize the mixtures with a reduced transmission bandwidth.
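As a point of reference for the mixture-alignment idea, the sketch below estimates the relative delay between the mixture signals captured by two nodes using the classical GCC-PHAT cross-correlation and compensates it with an integer-sample shift before any frame-based processing. This is a minimal illustration only, not the paper's proposed alignment methods; the function names, the PHAT weighting choice, and the simple circular shift are assumptions made for the example.

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs, max_tau=None):
    """Estimate how much `sig` lags `ref`, in seconds, via GCC-PHAT."""
    n = len(sig) + len(ref)                   # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12                    # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Rearrange so the array covers lags -max_shift .. +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(fs)

def align_to_reference(ref, sig, fs):
    """Advance `sig` by its estimated integer-sample lag so both mixtures line up."""
    tau = gcc_phat_delay(sig, ref, fs)        # positive tau: sig arrives later than ref
    k = int(round(tau * fs))
    return np.roll(sig, -k)                   # circular shift, adequate for a short sketch
```

For instance, with two hypothetical recordings x1 and x2 sampled at 16 kHz, align_to_reference(x1, x2, 16000) returns a shifted copy of x2 that can be fed, together with x1, to a frame-based BSS algorithm.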
