Diffuse noise suppression with asynchronous microphone array based on amplitude additivity model

In this paper, we propose a method for suppressing a large number of interferences by using multichannel amplitude analysis based on nonnegative matrix factorization (NMF) and its effective semi-supervised training. For the point-source interference reduction of an asynchronous microphone array, we propose amplitude-based speech enhancement in the time-channel domain, which we call transfer-function-gain NMF. Transfer-function-gain NMF is a robust method against drift, which disrupts an inter-channel phase analysis. We use this method to suppress a large number of sources. We show that a mass of interferences can be modeled by a single basis assuming that the noise sources are sufficiently far from the microphones and the spatial characteristics become similar to each other. Since the blind optimization of the NMF parameters does not work well with merely sparse observation contaminated by the constant heavy noise, we train the diffuse noise basis in advance of the noise suppression using a speech absent observation, which can be obtained easily using a simple voice activity detection technique. We confirmed the effectiveness of our proposed model and semi-supervised transfer-function-gain NMF in an experiment simulating a target source that was surrounded by a diffuse noise.

[1]  Shoji Makino,et al.  Blind compensation of interchannel sampling frequency mismatch for ad hoc microphone array based on maximum likelihood estimation , 2015, Signal Process..

[2]  Takeshi Yamada,et al.  Amplitude-based speech enhancement with nonnegative matrix factorization for asynchronous distributed recording , 2014, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC).

[3]  Takeshi Yamada,et al.  Speech enhancement with ad-hoc microphone array using single source activity , 2013, 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference.

[4]  Shoji Makino,et al.  Blind compensation of inter-channel sampling frequency mismatch with maximum likelihood estimation in STFT domain , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[5]  水野 賀文,et al.  Selection of the number of basis vectors in non-negative matrix factorization for music transcription , 2012 .

[6]  Tomohiro Nakatani,et al.  Distributed microphone array processing for speech source separation with classifier fusion , 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.

[7]  Alexander Bertrand,et al.  Applications and trends in wireless acoustic sensor networks: A signal processing perspective , 2011, 2011 18th IEEE Symposium on Communications and Vehicular Technology in the Benelux (SCVT).

[8]  Nobutaka Ito,et al.  Diffuse Noise Suppression Using Crystal-Shaped Microphone Arrays , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[9]  Nancy Bertin,et al.  Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis , 2009, Neural Computation.

[10]  Zicheng Liu SOUND SOURCE SEPARATION WITH DISTRIBUTED MICROPHONE ARRAYS IN THE PRESENCE OF CLOCK SYNCHRONIZATION ERRORS , 2008 .

[11]  Ted S. Wada,et al.  On Dealing with Sampling Rate Mismatches in Blind Source Separation and Acoustic Echo Cancellation , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[12]  Bhiksha Raj,et al.  Supervised and Semi-supervised Separation of Sounds from Single-Channel Mixtures , 2007, ICA.

[13]  Tuomas Virtanen,et al.  Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[14]  Rémi Gribonval,et al.  Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[15]  Hervé Bourlard,et al.  Microphone array post-filter based on noise field coherence , 2003, IEEE Trans. Speech Audio Process..

[16]  P. Smaragdis,et al.  Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).