Offline to online speaker adaptation for real-time deep neural network based LVCSR systems
[1] Yongqiang Wang,et al. Investigations on speaker adaptation of LSTM RNN models for speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[4] Hank Liao,et al. Speaker adaptation of context dependent deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[6] Xiaohui Zhang,et al. Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging , 2014, ICLR.
[7] Douglas A. Reynolds,et al. A unified deep neural network for speaker and language recognition , 2015, INTERSPEECH.
[8] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[9] Daniel Garcia-Romero,et al. Time delay deep neural network-based universal background models for speaker recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[10] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Themos Stafylakis,et al. Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition , 2014, Odyssey.
[12] Li-Rong Dai,et al. Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Li-Rong Dai,et al. Speaker adaptation of RNN-BLSTM for speech recognition based on speaker code , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[15] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Chin-Hui Lee,et al. Bayesian Learning of Gaussian Mixture Densities for Hidden Markov Models , 1991, HLT.
[17] C. Zhang,et al. DNN speaker adaptation using parameterised sigmoid and ReLU hidden activation functions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Yun Lei,et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Li-Rong Dai,et al. Speaker Adaptation of Hybrid NN/HMM Model for Speech Recognition Based on Singular Value Decomposition , 2014, Journal of Signal Processing Systems.
[20] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[21] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[22] Florin Curelaru,et al. Front-End Factor Analysis For Speaker Verification , 2018, 2018 International Conference on Communications (COMM).
[23] John H. L. Hansen,et al. Duration mismatch compensation for i-vector based speaker recognition systems , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[24] Yifan Gong,et al. Low-rank plus diagonal adaptation for deep neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Patrick Kenny,et al. Eigenvoice modeling with sparse training data , 2005, IEEE Transactions on Speech and Audio Processing.
[26] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[27] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.
[28] Andrew W. Senior,et al. Fast and accurate recurrent neural network acoustic models for speech recognition , 2015, INTERSPEECH.
[29] Souvik Kundu,et al. Speaker-aware training of LSTM-RNNs for acoustic modelling , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[31] Yifan Gong,et al. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014 .
[33] Patrick Kenny,et al. Mixture of PLDA Models in i-vector Space for Gender-Independent Speaker Recognition , 2011, INTERSPEECH.
[34] Florian Metze,et al. Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[35] Andrew W. Senior,et al. Improving DNN speaker independence with I-vector inputs , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Steve Renals,et al. Differentiable pooling for unsupervised speaker adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Sanjeev Khudanpur,et al. Acoustic Modelling from the Signal Domain Using CNNs , 2016, INTERSPEECH.
[38] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Sree Hari Krishnan Parthasarathi,et al. Robust i-vector based adaptation of DNN acoustic model for speech recognition , 2015, INTERSPEECH.
[40] Sanjeev Khudanpur,et al. JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[41] Tara N. Sainath,et al. Acoustic modelling with CD-CTC-sMBR LSTM RNNs , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[42] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..
[43] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.