Paper information - Multi-Channel Source Separation: Overview and Comparison of Mask-based and Linear Separation Algorithms

Separation of speech signals in noisy multi-speaker environments, as exemplified by the cocktail party problem, is a feasible task for a human listener, even under rather severe conditions such as many interfering speakers and loud background noise. Algorithms that help machines perform a similar task are currently an active area of research because of their wide range of applications, such as human-machine interfaces, digital hearing aids, and intelligent robots.

Nilesh Madhu | André Gückel
