Recognizing Multi-talker Speech with Permutation Invariant Training
暂无分享,去创建一个
Dong Yu | Yanmin Qian | Xuankai Chang | Dong Yu | Y. Qian | Xuankai Chang
[1] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Nima Mesgarani,et al. Deep attractor network for single-microphone speaker separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Geoffrey Zweig,et al. Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.
[4] Jonathan Le Roux,et al. Single-Channel Multi-Speaker Separation Using Deep Clustering , 2016, INTERSPEECH.
[5] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[6] Philip C. Woodland,et al. Very deep convolutional neural networks for robust speech recognition , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[7] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014 .
[8] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[9] Yanmin Qian,et al. Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[11] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Brian Kingsbury,et al. Very deep multilingual convolutional neural networks for LVCSR , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Steve Renals,et al. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[14] Kai Yu,et al. Very deep convolutional neural networks for LVCSR , 2015, INTERSPEECH.
[15] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[17] Jesper Jensen,et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Dong Yu,et al. Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition , 2010 .
[19] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[20] John R. Hershey,et al. Monaural speech separation and recognition challenge , 2010, Comput. Speech Lang..
[21] Yu Hu,et al. Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency , 2015, ArXiv.
[22] References , 1971 .
[23] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Michael I. Jordan,et al. Factorial Hidden Markov Models , 1995, Machine Learning.
[26] Dong Yu,et al. Multi-talker Speech Separation and Tracing with Permutation Invariant Training of Deep Recurrent Neural Networks , 2017, ArXiv.
[27] Dong Yu,et al. Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Horacio Franco,et al. Time-frequency convolutional networks for robust speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[29] Lukás Burget,et al. Transcribing Meetings With the AMIDA Systems , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Geoffrey Zweig,et al. An introduction to computational networks and the computational network toolkit (invited talk) , 2014, INTERSPEECH.
[31] Geoffrey Zweig,et al. Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention , 2016, INTERSPEECH.