Multi-Channel Source Separation: Overview and Comparison of Mask-based and Linear Separation Algorithms

Separation of speech signals in noisy multi-speaker environments – as exemplified by the cocktail party problem – is a feasible task for a human listener, even under rather severe conditions such as many interfering speakers and loud background noise. Algorithms that help machines perform a similar task are currently an active area of research because of their wide range of applications, including human-machine interfaces, digital hearing aids, and intelligent robots.
