Single Channel Speech Separation Using an Efficient Model-based Method

The problem of extracting multiple speech signals from a single mixed recording, referred to as single-channel speech separation, has received considerable attention in recent years, and many model-based techniques have been proposed. A major limitation of most of these systems is their inability to handle mixtures in which the signals are combined at different energy levels, because they assume that the test and training data have equal energy levels, an assumption that rarely holds in practice. Our proposed method, based on the MIXMAX approximation and sub-section vector quantization (VQ), is an attempt to overcome this limitation. The proposed technique is compared with an approach in which a gain-adapted minimum mean square error (MMSE) estimator is derived to estimate the separated signals. Experiments show that our proposed method outperforms this estimator in terms of SNR while also reducing computational complexity.
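As a point of reference, the MIXMAX (mixture-maximization) approximation underlying the proposed method can be summarized as follows; the notation below is a minimal illustrative sketch, not taken from the paper. For a single-channel mixture y(t) = x_1(t) + x_2(t), the log-magnitude spectrum of the mixture in each frequency bin is approximated by the larger of the two source log-magnitude spectra:

% MIXMAX approximation in the log-spectral domain (illustrative notation):
% the mixture log-spectrum is approximated bin by bin by the dominant source,
% which is accurate whenever one speaker clearly dominates a frequency bin.
\[
  \log\lvert Y(k)\rvert \;\approx\; \max\bigl(\log\lvert X_1(k)\rvert,\; \log\lvert X_2(k)\rvert\bigr),
  \qquad k = 1,\dots,K
\]

Under this approximation, each time-frequency bin can be attributed largely to one speaker, which is what makes codebook (VQ) based separation of the two sources tractable.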
