Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement

This paper addresses multichannel directional speech enhancement using geometrically constrained independent vector analysis (GCIVA), which aims to combine the high separation performance of blind source separation with the directional focus of beamforming. The proposed method uses geometric constraints, constructed from spatial information about the sources, to guide the target speech to the desired output channel. A convergence-guaranteed parameter estimation algorithm is derived within the framework of auxiliary function-based IVA (AuxIVA), retaining its fast convergence, low computational cost, and freedom from step-size tuning. We build a dual-microphone speech enhancement system on the proposed method and evaluate its effectiveness with objective metrics. In the experimental evaluations, the proposed system outperformed both conventional beamforming and standard AuxIVA by a large margin in terms of source-to-distortion and source-to-interference ratios.
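To make the notion of a geometric constraint concrete, the following is a minimal sketch of one common formulation, not necessarily the exact one used in this paper. It assumes a far-field source, a dual-microphone array along one axis, and a quadratic penalty |w^H d(theta) - c|^2 on the response of a demixing filter w toward direction theta. The function names, the 5 cm spacing, the 1 kHz frequency bin, and the 60 degree direction are all hypothetical values chosen for illustration.

```python
import numpy as np


def steering_vector(freq_hz, mic_positions_m, doa_rad, sound_speed=343.0):
    """Far-field steering vector for microphones placed along a single axis.

    Assumption: plane-wave propagation; doa_rad is measured from the array axis.
    """
    delays = mic_positions_m * np.cos(doa_rad) / sound_speed
    return np.exp(-2j * np.pi * freq_hz * delays)


def geometric_penalty(w, d, target):
    """Squared deviation of the filter response w^H d from the desired value.

    target = 1 asks the filter to pass the direction undistorted;
    target = 0 asks it to place a spatial null on that direction.
    """
    response = np.vdot(w, d)  # np.vdot conjugates its first argument: w^H d
    return np.abs(response - target) ** 2


# Hypothetical setup: two microphones 5 cm apart, 1 kHz bin, source at 60 degrees.
mics = np.array([0.0, 0.05])
d = steering_vector(1000.0, mics, np.deg2rad(60.0))

# A delay-and-sum filter passes the steered direction with unit gain.
w_pass = d / len(mics)

# A simple two-element null filter cancels the same direction (d[0] == 1 here).
w_null = np.array([1.0 + 0.0j, -d[1]])

print(geometric_penalty(w_pass, d, target=1.0))
print(geometric_penalty(w_null, d, target=0.0))
```

In a constrained IVA setting, a weighted sum of such penalties over frequency bins would typically be added to the separation objective, so that one output channel is steered toward the target direction while the separation criterion handles the remaining degrees of freedom.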
