Differentiable Max-Directivity Beamforming Normalization for Independent Vector Analysis

Independent vector analysis (IVA) minimizes an objective function to estimate separation filters that separate mixture signals into individual source signals. Unfortunately, IVA often suffers from the well-known block permutation problem. To mitigate this problem, the use of geometric knowledge has been studied, but two crucial issues remain: the need for non-differentiable processes outside the minimization, and the need for high-level geometric clues such as the directions of arrival (DOAs) of the source signals. This paper therefore presents a novel IVA method whose objective function includes a differentiable max-directivity beamforming normalization (MDBN) term. This term draws geometric knowledge from only a low-level geometric clue (the positions of the microphone array) via the traditional beamforming (BF) concept that each separation filter should have maximum gain toward one specific DOA across all frequency bins. As a result, the overall objective function can be minimized by gradient descent, and the MDBN term encourages the separation filters to focus on specific directions, which implicitly estimates the most plausible DOAs of the source signals at each iteration. Our method thus uses geometric knowledge while avoiding both issues, and it estimates good separation filters while mitigating the permutation problem. Several experiments show that our method outperforms conventional BF and IVA methods.
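
To make the beamforming concept behind the MDBN term concrete, the following is a minimal sketch rather than the formulation used in the paper. It assumes a far-field plane-wave model, a hypothetical four-microphone linear array, a grid of candidate DOAs, and a log-sum-exp soft maximum (with an assumed sharpness parameter `beta`) as the smooth stand-in for "maximum gain for one DOA across all frequency bins". The function names `steering_vectors` and `max_directivity_penalty` are illustrative only. The sketch uses NumPy for the forward computation; every operation in it is smooth, so the same expression could in principle be written in an automatic-differentiation framework and added to an IVA objective.

```python
import numpy as np

def steering_vectors(mic_pos, freqs, angles, c=343.0):
    """Far-field steering vectors (assumed plane-wave model).

    mic_pos: (M, 2) microphone positions in meters
    freqs:   (F,) frequencies in Hz
    angles:  (D,) candidate DOAs in radians
    returns: (F, M, D) complex array
    """
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (D, 2)
    delays = mic_pos @ directions.T / c                              # (M, D)
    return np.exp(-2j * np.pi * freqs[:, None, None] * delays[None, :, :])

def max_directivity_penalty(W, steer, beta=50.0):
    """Illustrative penalty that is small when each filter has one dominant DOA.

    W:     (F, N, M) separation filters; row n at bin f is w_{f,n}^H
    steer: (F, M, D) steering vectors
    returns: (penalty, avg_gain) with avg_gain of shape (N, D)
    """
    n_mics = W.shape[2]
    resp = np.einsum("fnm,fmd->fnd", W, steer)              # beam response
    power = np.sum(np.abs(W) ** 2, axis=2, keepdims=True)   # (F, N, 1)
    gain = np.abs(resp) ** 2 / (power * n_mics)             # in [0, 1] by Cauchy-Schwarz
    avg_gain = gain.mean(axis=0)                            # average over frequency bins
    # Smooth maximum over candidate DOAs (numerically stable log-sum-exp).
    peak = avg_gain.max(axis=1, keepdims=True)
    soft_max = peak[:, 0] + np.log(np.exp(beta * (avg_gain - peak)).sum(axis=1)) / beta
    return -soft_max.sum(), avg_gain

# Hypothetical setup: 4 microphones spaced 5 cm apart, DOA grid in 5-degree steps.
mic_pos = np.array([[0.00, 0.0], [0.05, 0.0], [0.10, 0.0], [0.15, 0.0]])
freqs = np.linspace(100.0, 8000.0, 64)
angles = np.deg2rad(np.arange(0, 181, 5))
steer = steering_vectors(mic_pos, freqs, angles)

# Delay-and-sum filters aimed at grid indices 12 and 24 (60 and 120 degrees).
targets = [12, 24]
W_das = np.stack([steer[:, :, d].conj() / len(mic_pos) for d in targets], axis=1)

penalty, avg_gain = max_directivity_penalty(W_das, steer)
print("estimated DOAs (deg):", np.rad2deg(angles[avg_gain.argmax(axis=1)]))
print("penalty:", penalty)
```

In this sketch the DOA with the largest frequency-averaged gain plays the role of the implicitly estimated direction for each filter, so no DOA has to be supplied in advance; only the microphone positions enter, through the steering vectors.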
