Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation
Dong Yu | Dan Su | Meng Yu | Yanmin Qian | Lianwu Chen
[1] Antonio Bonafonte, et al. SEGAN: Speech Enhancement Generative Adversarial Network, 2017, INTERSPEECH.
[2] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[3] Jesper Jensen, et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Rémi Gribonval, et al. Performance measurement in blind audio source separation, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Dong Yu, et al. Past review, current progress, and challenges ahead on the cocktail party problem, 2018, Frontiers of Information Technology & Electronic Engineering.
[6] Guy J. Brown, et al. Computational auditory scene analysis, 1994, Comput. Speech Lang.
[7] David Malah, et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, 1984, IEEE Trans. Acoust. Speech Signal Process.
[8] Jonathan Le Roux, et al. Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] E. C. Cmm, et al. on the Recognition of Speech, with, 2008.
[10] Chris Donahue, et al. Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Guy J. Brown, et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, 2006.
[12] Andries P. Hekstra, et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[13] Björn W. Schuller, et al. Non-negative matrix factorization as noise-robust feature extractor for speech recognition, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[14] Dong Yu, et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Zhuo Chen, et al. Deep clustering: Discriminative embeddings for segmentation and separation, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Daniel P. W. Ellis, et al. Speech enhancement by low-rank and convolutive dictionary spectrogram decomposition, 2014, INTERSPEECH.