Single-channel blind source separation based on joint dictionary with common sub-dictionary

In single-channel blind source separation (SBSS) based on sparse representation theory, cross projection arises when a mixed speech signal is represented over a joint dictionary, because the joint dictionary discriminates poorly between the sources; this leads to poor separation performance. To address this problem, this paper proposes a new algorithm for constructing a joint dictionary with a common sub-dictionary. When a source signal is represented over the new joint dictionary, it is effectively prevented from being projected onto the other source's sub-dictionary. In the new algorithm, we first learn an identity sub-dictionary for each speaker from that speaker's source speech signals. We then discard the similar atoms shared by the two identity sub-dictionaries and use these similar atoms to construct a common sub-dictionary. Finally, we combine the three sub-dictionaries into a joint dictionary. The Euclidean distance between two atoms from different identity sub-dictionaries is used to measure their correlation, and similar atoms are searched for based on this correlation. In the testing stage, each source is reconstructed from the projection coefficients corresponding to its individual sub-dictionary and to the common sub-dictionary. Comparative experiments on a speech database show that the proposed algorithm performs better when the signal-to-noise ratio (SNR) is used to measure separation quality. The proposed algorithm also has lower time complexity.
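
The construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the identity sub-dictionaries are assumed to be already learned (for example with K-SVD) and to have unit-norm columns; the distance cutoff `threshold`, the greedy one-to-one matching, the rule that merges a similar pair into one common atom, the plain orthogonal matching pursuit coder, and the `share` weight that splits the common component between the two sources are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def build_joint_dictionary(D1, D2, threshold):
    """Split two identity sub-dictionaries into reduced parts and a common part.

    D1, D2: arrays of shape (n_features, n_atoms) with unit-norm columns.
    threshold: assumed Euclidean-distance cutoff below which atoms are "similar".
    """
    # Pairwise Euclidean distances between every atom of D1 and every atom of D2.
    dist = np.linalg.norm(D1[:, :, None] - D2[:, None, :], axis=0)

    used1, used2, common = set(), set(), []
    # Greedy one-to-one matching, visiting pairs from closest to farthest.
    for flat in np.argsort(dist, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat, dist.shape))
        if dist[i, j] > threshold:
            break
        if i in used1 or j in used2:
            continue
        used1.add(i)
        used2.add(j)
        atom = D1[:, i] + D2[:, j]  # assumed merge rule: normalized sum of the pair
        common.append(atom / np.linalg.norm(atom))

    keep1 = [i for i in range(D1.shape[1]) if i not in used1]
    keep2 = [j for j in range(D2.shape[1]) if j not in used2]
    Dc = np.column_stack(common) if common else np.empty((D1.shape[0], 0))
    return D1[:, keep1], D2[:, keep2], Dc


def omp(D, y, n_nonzero):
    """Plain orthogonal matching pursuit (a stand-in sparse coder)."""
    x = np.zeros(D.shape[1])
    residual, support = y.copy(), []
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        x = np.zeros(D.shape[1])
        x[support] = coef
    return x


def separate(x, D1p, D2p, Dc, share=0.5):
    """Reconstruct both sources from coefficients over [D1p, D2p, Dc].

    share: assumed fraction of the common component assigned to source 1.
    """
    n1, n2 = D1p.shape[1], D2p.shape[1]
    x1, x2, xc = x[:n1], x[n1:n1 + n2], x[n1 + n2:]
    common_part = Dc @ xc
    s1 = D1p @ x1 + share * common_part
    s2 = D2p @ x2 + (1.0 - share) * common_part
    return s1, s2
```

A mixture frame `y` would then be coded as `x = omp(np.hstack([D1p, D2p, Dc]), y, n_nonzero)` and passed to `separate`. Moving near-duplicate atoms into `Dc` means that energy shared by both speakers no longer has to be assigned to one speaker's sub-dictionary, which is the cross-projection effect the abstract describes.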
