Single-Channel Multi-Speaker Speech Separation Based on Quantized Ratio Mask and Residual Network
Gang Li | Zhongyuan Wang | Tingzhao Wu | Shanfa Ke | Ruimin Hu | Xiaochen Wang
[1] Zhuo Chen, et al. Deep clustering: Discriminative embeddings for segmentation and separation, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Kevin Wilson, et al. Looking to listen at the cocktail party, 2018, ACM Trans. Graph.
[3] Nima Mesgarani, et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Jonathan Le Roux, et al. Single-Channel Multi-Speaker Separation Using Deep Clustering, 2016, INTERSPEECH.
[5] Yifan Gong, et al. An Overview of Noise-Robust Automatic Speech Recognition, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] DeLiang Wang, et al. Complex Ratio Masking for Monaural Speech Separation, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Paris Smaragdis, et al. Deep learning for monaural speech separation, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] DeLiang Wang, et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Jonathan Le Roux, et al. Teacher-Student Deep Clustering for Low-Delay Single Channel Speech Separation, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] DeLiang Wang, et al. A Deep Ensemble Learning Method for Monaural Speech Separation, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Ganesh Ananthanarayanan, et al. Deep recurrent neural networks based binaural speech segregation for the selection of closest target of interest, 2017, Multimedia Tools and Applications.
[12] Daniel P. W. Ellis, et al. MIR_EVAL: A Transparent Implementation of Common MIR Metrics, 2014, ISMIR.
[13] Changshui Zhang, et al. Listen and Look: Audio–Visual Matching Assisted Speech Source Separation, 2018, IEEE Signal Processing Letters.
[14] DeLiang Wang, et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis, 2005, Speech Separation by Humans and Machines.
[15] Bhiksha Raj, et al. Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Dong Yu, et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] Tim Brookes, et al. On the Ideal Ratio Mask as the Goal of Computational Auditory Scene Analysis, 2014.
[18] DeLiang Wang, et al. On Training Targets for Supervised Speech Separation, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[19] Jonathan G. Fiscus, et al. DARPA TIMIT: Acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1, 1993.
[20] DeLiang Wang, et al. Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Jesper Jensen, et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Zhong-Qiu Wang, et al. End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction, 2018, INTERSPEECH.
[23] DeLiang Wang, et al. An algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker, 2017, The Journal of the Acoustical Society of America.
[24] John R. Hershey, et al. Phasebook and Friends: Leveraging Discrete Representations for Source Separation, 2018, IEEE Journal of Selected Topics in Signal Processing.
[25] DeLiang Wang, et al. An Unsupervised Approach to Cochannel Speech Separation, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Erkki Oja, et al. Independent component analysis: algorithms and applications, 2000, Neural Networks.
[27] Rémi Gribonval, et al. Performance measurement in blind audio source separation, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[28] Tatsuya Kawahara, et al. Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[29] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[30] Zhong-Qiu Wang, et al. Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation, 2018, INTERSPEECH.
[31] Daniel P. W. Ellis, et al. Model-Based Expectation-Maximization Source Separation and Localization, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[32] Paris Smaragdis, et al. Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[33] Nicolás Ruiz-Reyes, et al. Monophonic constrained non-negative sparse coding using instrument models for audio separation and transcription of monophonic source-based polyphonic mixtures, 2013, Multimedia Tools and Applications.
[34] Ruimin Hu, et al. Multi-speakers Speech Separation Based on Modified Attractor Points Estimation and GMM Clustering, 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).
[35] E. C. Cherry. Some Experiments on the Recognition of Speech, with One and with Two Ears, 1953, The Journal of the Acoustical Society of America.
[36] Yi-Hsuan Yang, et al. Complex and Quaternionic Principal Component Pursuit and Its Application to Audio Separation, 2016, IEEE Signal Processing Letters.
[37] DeLiang Wang, et al. Exploring Monaural Features for Classification-Based Speech Segregation, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[38] Tuomas Virtanen. Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria, 2007.
[39] Jun Du, et al. A Regression Approach to Single-Channel Speech Separation Via High-Resolution Deep Neural Networks, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[40] Sharon Gannot, et al. Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function, 2019.
[41] Tara N. Sainath, et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).