Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition
[1] E. A. Martin,et al. Multi-style training for robust isolated-word speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[2] Khe Chai Sim,et al. Noise adaptive front-end normalization based on Vector Taylor Series for Deep Neural Networks in robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[3] Ngoc Thang Vu,et al. Multilingual bottle-neck features and its application for under-resourced languages , 2012, SLTU.
[4] Khe Chai Sim,et al. Factorized Hidden Layer Adaptation for Deep Neural Network Based Acoustic Modeling , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Dong Yu,et al. Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Gerhard Rigoll,et al. Two-stage speaker adaptation of hybrid tied-posterior acoustic models , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[7] Jan Cernocký,et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[8] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[9] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[10] Ebru Arisoy,et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Mark J. F. Gales. Cluster adaptive training of hidden Markov models , 2000, IEEE Trans. Speech Audio Process.
[12] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[13] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[14] Hui Jiang,et al. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] Georg Heigold,et al. A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Li-Rong Dai,et al. Direct adaptation of hybrid DNN/HMM model for fast speaker adaptation in LVCSR based on speaker code , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Hank Liao,et al. Speaker adaptation of context dependent deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Dong Yu,et al. An investigation into using parallel data for far-field speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Thomas Hain,et al. An investigation into speaker informed DNN front-end for LVCSR , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Khe Chai Sim. On constructing and analysing an interpretable brain model for the DNN based on hidden activity patterns , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[21] Florian Metze,et al. Towards speaker adaptive training of deep neural network acoustic models , 2014, INTERSPEECH.
[22] Mark J. F. Gales,et al. Improving the interpretability of deep neural networks with stimulated learning , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[23] Martin Karafiát,et al. The language-independent bottleneck features , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[24] Xiaodong Cui,et al. Data Augmentation for Deep Neural Network Acoustic Modeling , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[25] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[26] Mark J. F. Gales,et al. I-vector estimation using informative priors for adaptation of deep neural networks , 2015, INTERSPEECH.
[27] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[28] Jasha Droppo,et al. Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[31] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[32] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012.
[33] Khe Chai Sim,et al. An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Khe Chai Sim,et al. Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models , 2016, INTERSPEECH.
[35] Steve Renals,et al. SAT-LHUC: Speaker adaptive training for learning hidden unit contributions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Yongqiang Wang,et al. Adaptation of deep neural network acoustic models using factorised i-vectors , 2014, INTERSPEECH.
[37] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang.
[38] Tomohiro Nakatani,et al. Context adaptive deep neural networks for fast acoustic model adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Kai Yu,et al. Cluster Adaptive Training for Deep Neural Network Based Acoustic Model , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[40] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[41] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Kai Yu,et al. Cluster adaptive training for deep neural network , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Richard M. Stern,et al. Gammatone sub-band magnitude-domain dereverberation for ASR , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Yonghong Yan,et al. A Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition , 2012, INTERSPEECH.
[45] Chengzhu Yu,et al. Context adaptive deep neural networks for fast acoustic model adaptation in noisy conditions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Khe Chai Sim,et al. Temporally Varying Weight Regression: A Semi-Parametric Trajectory Model for Automatic Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[47] Kai Yu,et al. Multi-task joint-learning of deep neural networks for robust speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[48] Florian Metze,et al. Distance-aware DNNs for robust speech recognition , 2015, INTERSPEECH.
[49] Andrew W. Senior,et al. Improving DNN speaker independence with I-vector inputs , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] Kaisheng Yao,et al. Adaptation of context-dependent deep neural networks for automatic speech recognition , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[51] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process.
[52] Khe Chai Sim,et al. On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in DNN acoustic models , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] Patrick Kenny,et al. Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification , 2009, INTERSPEECH.
[54] G. Carter,et al. The generalized correlation method for estimation of time delay , 1976.
[55] Yifan Gong,et al. Restructuring of deep neural network acoustic models with singular value decomposition , 2013, INTERSPEECH.
[56] Lukás Burget,et al. Transcribing Meetings With the AMIDA Systems , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[57] Yu Zhang,et al. Speech recognition with prediction-adaptation-correction recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[58] Tomohiro Nakatani,et al. Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition , 2012, IEEE Signal Process. Mag.
[59] Yasuo Horiuchi,et al. Reverberant speech recognition based on denoising autoencoder , 2013, INTERSPEECH.
[60] Tasha Nagamine,et al. Exploring how deep neural networks form phonemic categories , 2015, INTERSPEECH.
[61] Pietro Laface,et al. Adaptation of Hybrid ANN/HMM Models Using Linear Hidden Transformations and Conservative Training , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[62] Steve Renals,et al. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[63] Khe Chai Sim,et al. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems , 2010, INTERSPEECH.
[64] Mark J. F. Gales,et al. Combining i-vector representation and structured neural networks for rapid adaptation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Kaisheng Yao,et al. Intermediate-layer DNN adaptation for offline and session-based iterative speaker adaptation , 2015, INTERSPEECH.
[66] Khe Chai Sim,et al. Joint adaptation and adaptive training of TVWR for robust automatic speech recognition , 2014, INTERSPEECH.
[67] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[68] Li-Rong Dai,et al. Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[69] Mark J. F. Gales,et al. Stimulated Deep Neural Network for Speech Recognition , 2016, INTERSPEECH.
[70] Mark J. F. Gales,et al. Multi-basis adaptive neural network for rapid adaptation in speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[71] Florian Metze,et al. Improvements to speaker adaptive training of deep neural networks , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[72] Themos Stafylakis,et al. I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[73] Patrick Kenny,et al. A Study of Interspeaker Variability in Speaker Verification , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[74] Yongqiang Wang,et al. An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[75] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[76] Martin Karafiát,et al. Study of probabilistic and Bottle-Neck features in multilingual environment , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[77] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang.
[78] Ciro Martins,et al. Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system , 1995, EUROSPEECH.
[79] Yifan Gong,et al. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[80] Marc Delcroix,et al. Joint acoustic factor learning for robust deep neural network based automatic speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[81] Sanjeev Khudanpur,et al. Reverberation robust acoustic modeling using i-vectors with time delay neural networks , 2015, INTERSPEECH.
[82] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014.
[83] Li-Rong Dai,et al. Speaker Adaptation of Hybrid NN/HMM Model for Speech Recognition Based on Singular Value Decomposition , 2016, J. Signal Process. Syst.
[84] Richard M. Stern,et al. A vector Taylor series approach for environment-independent speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[85] Yu Zhang,et al. Integrated adaptation with multi-factor joint-learning for far-field speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[86] Frantisek Grézl,et al. Optimizing bottle-neck features for LVCSR , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[87] Jan Zelinka,et al. Adaptation of a Feedforward Artificial Neural Network Using a Linear Transform , 2010, TSD.