A Survey on Single Channel Speech Separation

Single channel speech separation is a branch of speech separation that has been an active research topic for the past 40 years. Even so, no existing method separates the desired signal from a mixture of signals with 100% accuracy, and none is yet practical for everyday use. Many approaches have been studied, based on parameters of the speech signal such as pitch, phase, magnitude, amplitude, frequency, energy, and the spectrogram. This paper surveys the various issues in single channel speech separation and, in conclusion, points out the major challenges the speech research community faces in realizing such a system.
