Blind and Semi-blind Anechoic Mixing System Identification Using Multichannel Matching Pursuit

Sparse component analysis techniques have been successfully applied to the separation of speech sources. This paper presents an efficient algorithm based on the matching pursuit approach for multichannel recordings. The proposed algorithm explicitly employs spatial constraints among the channels to express the mixed signals as linear combinations of delayed components selected from an overcomplete dictionary. We present a new procedure for estimating the mixing system parameters (attenuations and delays) that can be applied to more than two mixtures and is not restricted to non-negative attenuation coefficients. The proposed estimation method can also accommodate larger delays than traditional approaches. In addition, when excerpts of the sources (exogenous to the mixtures) are available, learned dictionaries can be used to improve the identification step. Simulation results show that semi-blind dictionaries perform better than those used in blind configurations.
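As a rough illustration of the idea of matching one dictionary component across channels to recover an attenuation and a delay, the following minimal sketch handles a single atom and two channels. It is not the algorithm proposed in the paper: the Gabor atom, the integer-sample delay search, the circular shift via `np.roll`, and all function and parameter names are assumptions made for this example only.

```python
import numpy as np

def gabor_atom(length, center, freq, width):
    """Unit-norm real Gabor atom (Gaussian-windowed cosine); an assumed dictionary element."""
    t = np.arange(length)
    g = np.exp(-0.5 * ((t - center) / width) ** 2) * np.cos(2 * np.pi * freq * (t - center))
    return g / np.linalg.norm(g)

def estimate_delay_and_gain(x_ref, x_other, atom, max_delay):
    """Match one atom in a reference channel and search integer delays in a second channel.

    Returns (relative_gain, delay). The gain is the ratio of the two projection
    coefficients, so it may be negative.
    """
    c_ref = np.dot(x_ref, atom)  # coefficient of the atom in the reference channel
    best_delay, best_coef = 0, 0.0
    for d in range(-max_delay, max_delay + 1):
        # Circular shift: acceptable here because the atom is negligible near the edges.
        coef = np.dot(x_other, np.roll(atom, d))
        if abs(coef) > abs(best_coef):
            best_delay, best_coef = d, coef
    return best_coef / c_ref, best_delay

# Toy anechoic mixture of one component: channel 2 is an attenuated,
# sign-inverted, delayed copy of channel 1.
atom = gabor_atom(length=256, center=128, freq=0.05, width=12.0)
x1 = 2.0 * atom
x2 = -0.5 * 2.0 * np.roll(atom, 7)

gain, delay = estimate_delay_and_gain(x1, x2, atom, max_delay=20)
print(gain, delay)
```

Selecting the delay by the largest absolute coefficient, rather than the largest signed one, is what lets this toy version return a negative relative attenuation.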
