Consistent independent low-rank matrix analysis for determined blind source separation

Independent low-rank matrix analysis (ILRMA) is the state-of-the-art algorithm for blind source separation (BSS) in the determined situation (the number of microphones is greater than or equal to the number of source signals). ILRMA achieves high separation performance by modeling the power spectrograms of the source signals via nonnegative matrix factorization (NMF). Such a highly developed source model can effectively solve the permutation problem of frequency-domain BSS, which is presumably the reason for the excellent performance of ILRMA. In this paper, we further improve the separation performance of ILRMA by additionally considering a general structure of spectrograms called consistency; hence, we call the proposed method Consistent ILRMA. Since a spectrogram is calculated with overlapping windows (and a window function induces spectral smearing in the form of main and side lobes), the time-frequency bins depend on each other. In other words, the time-frequency components are related to each other via the uncertainty principle. Such co-occurrence among the spectral components can help solve the permutation problem, as demonstrated by a recent study. Based on these facts, we propose an algorithm that realizes Consistent ILRMA by slightly modifying the original algorithm. Its performance was extensively studied through experiments performed with various window lengths and shift lengths. The results indicated several tendencies of the original and proposed ILRMA, including some that have not been discussed well in the literature. For example, the proposed Consistent ILRMA tends to outperform the original ILRMA when the window length is sufficiently long compared to the reverberation time of the mixing system.
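The notion of consistency mentioned in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: the helper names (`stft`, `istft`, `inconsistency`), the Hamming window, the window length of 256, and the shift of 64 are all assumptions chosen for illustration. The sketch measures how far a complex array is from being the STFT of some time-domain signal by projecting it back through the inverse STFT and the STFT again.

```python
import numpy as np

def stft(x, win, hop):
    """Plain STFT (hypothetical helper): rows are frames, columns are frequency bins."""
    n = len(win)
    frames = 1 + (len(x) - n) // hop
    idx = np.arange(n)[None, :] + hop * np.arange(frames)[:, None]
    return np.fft.rfft(x[idx] * win, axis=1)

def istft(X, win, hop):
    """Least-squares inverse STFT: windowed overlap-add divided by the summed squared window."""
    n = len(win)
    frames = X.shape[0]
    length = n + hop * (frames - 1)
    y = np.zeros(length)
    wsum = np.zeros(length)
    seg = np.fft.irfft(X, n=n, axis=1) * win
    for t in range(frames):
        y[t * hop:t * hop + n] += seg[t]
        wsum[t * hop:t * hop + n] += win ** 2
    return y / np.maximum(wsum, 1e-12)

def inconsistency(X, win, hop):
    """Relative distance between X and its projection STFT(iSTFT(X))."""
    P = stft(istft(X, win, hop), win, hop)
    return np.linalg.norm(X - P) / np.linalg.norm(X)

# Assumed illustration parameters (not taken from the paper).
win = np.hamming(256)
hop = 64
rng = np.random.default_rng(0)

x = rng.standard_normal(4096)                   # arbitrary time-domain signal
X_consistent = stft(x, win, hop)                # a true spectrogram of x
X_random = (rng.standard_normal(X_consistent.shape)
            + 1j * rng.standard_normal(X_consistent.shape))  # arbitrary complex array

print("true spectrogram :", inconsistency(X_consistent, win, hop))
print("random array     :", inconsistency(X_random, win, hop))
```

Because the frames overlap, only a subset of all complex arrays are spectrograms of actual signals. The residual is near zero for the true spectrogram and clearly nonzero for the random array, which is the inter-bin dependency that the abstract describes.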
