Improvements to speaker adaptive training of deep neural networks
暂无分享,去创建一个
Florian Metze | Hao Zhang | Yajie Miao | Lu Jiang | Florian Metze | Lu Jiang | Y. Miao | Hao Zhang
[1] Juergen Luettin,et al. Audio-Visual Speech Modeling for Continuous Speech Recognition , 2000, IEEE Trans. Multim..
[2] Chalapathy Neti,et al. Asynchrony modeling for audio-visual speech recognition , 2002 .
[3] Jonathan Le Roux,et al. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Nicholas W. D. Evans,et al. ALIZE/spkdet: a state-of-the-art open source software for speaker recognition , 2008, Odyssey.
[5] Patrick Kenny,et al. Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification , 2009, INTERSPEECH.
[6] Khe Chai Sim,et al. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems , 2010, INTERSPEECH.
[7] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[8] Lukás Burget,et al. iVector-based discriminative adaptation for automatic speech recognition , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[9] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[10] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[11] Kaisheng Yao,et al. Adaptation of context-dependent deep neural networks for automatic speech recognition , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[12] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Keikichi Hirose,et al. Audio-visual feature integration based on piecewise linear transformation for noise robust automatic speech recognition , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[14] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Navdeep Jaitly,et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition , 2012, INTERSPEECH.
[16] Jinyu Li,et al. Hermitian based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models , 2012, INTERSPEECH.
[17] Yifan Gong,et al. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[19] Florian Metze,et al. Extracting deep bottleneck features using stacked auto-encoders , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[20] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Jan Cernocký,et al. Improved feature processing for deep neural networks , 2013, INTERSPEECH.
[22] Tara N. Sainath,et al. Improvements to Deep Convolutional Neural Networks for LVCSR , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[23] Hui Jiang,et al. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[24] Wonkyum Lee,et al. Modular combination of deep neural networks for acoustic modeling , 2013, INTERSPEECH.
[25] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[26] Florian Metze,et al. Deep maxout networks for low-resource speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[27] Yajie Miao,et al. Kaldi+PDNN: Building DNN-based ASR Systems with Kaldi and PDNN , 2014, ArXiv.
[28] Florian Metze,et al. Towards speaker adaptive training of deep neural network acoustic models , 2014, INTERSPEECH.
[29] Florian Metze,et al. Distributed learning of multilingual DNN feature extractors using GPUs , 2014, INTERSPEECH.
[30] Florian Metze,et al. Improving language-universal feature extraction with deep maxout and convolutional neural networks , 2014, INTERSPEECH.