Bayesian Learning for Deep Neural Network Adaptation
Xunying Liu | Tan Lee | Xurong Xie | Lan Wang
[1] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[2] Peter Bell,et al. Learning to adapt: a meta-learning approach for speaker adaptation , 2018, Interspeech 2018.
[3] Chin-Hui Lee,et al. A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition , 2016, Neurocomputing.
[4] Yifan Gong,et al. Low-rank plus diagonal adaptation for deep neural networks , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Steve Renals,et al. Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Navdeep Jaitly,et al. Vocal Tract Length Perturbation (VTLP) improves speech recognition , 2013 .
[8] Jianwei Yu,et al. LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition , 2019, INTERSPEECH.
[9] Yifan Gong,et al. Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Li Lee,et al. Speaker normalization using efficient frequency warping procedures , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[11] Xiaohui Zhang,et al. Parallel training of Deep Neural Networks with Natural Gradient and Parameter Averaging , 2014, ICLR.
[12] Charles M. Bishop,et al. Ensemble learning in Bayesian neural networks , 1998 .
[13] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[14] Mark J. F. Gales,et al. Multi-basis adaptive neural network for rapid adaptation in speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Stephen Cox,et al. Some statistical issues in the comparison of speech recognition algorithms , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[16] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[17] Lukás Burget,et al. Sequence-discriminative training of deep neural networks , 2013, INTERSPEECH.
[18] Chin-Hui Lee,et al. Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation , 2017, Pattern Recognit. Lett..
[19] Shoukang Hu,et al. BLHUC: Bayesian Learning of Hidden Unit Contributions for Deep Neural Network Speaker Adaptation , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Xiaofei Wang,et al. A Comparative Study on Transformer vs RNN in Speech Applications , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[21] Peter Bell,et al. Improving Children's Speech Recognition Through Out-of-Domain Data Augmentation , 2016, INTERSPEECH.
[22] Richard Socher,et al. An Investigation of Phone-Based Subword Units for End-to-End Speech Recognition , 2020, INTERSPEECH.
[23] Steve J. Young,et al. MMI training for continuous phoneme recognition on the TIMIT database , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[24] Tetsuji Ogawa,et al. Speaker Invariant Feature Extraction for Zero-Resource Languages with Adversarial Learning , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Li Deng,et al. Sequence classification using the high-level features extracted from deep neural networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Geoffrey Zweig,et al. Lattice-based unsupervised MLLR for speaker adaptation , 2000 .
[27] Kai Yu,et al. Cluster Adaptive Training for Deep Neural Network Based Acoustic Model , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Hermann Ney,et al. Cumulative Adaptation for BLSTM Acoustic Models , 2019, INTERSPEECH.
[30] Hermann Ney,et al. LSTM Neural Networks for Language Modeling , 2012, INTERSPEECH.
[31] Pawel Swietojanski,et al. Learning hidden unit contributions for unsupervised acoustic model adaptation , 2016 .
[32] Sanjeev Khudanpur,et al. Parallel training of DNNs with Natural Gradient and Parameter Averaging , 2014 .
[33] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[34] Richard M. Schwartz,et al. A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[35] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[36] C. Zhang,et al. DNN speaker adaptation using parameterised sigmoid and ReLU hidden activation functions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] J. Becker,et al. The natural history of Alzheimer's disease. Description of study cohort and accuracy of diagnosis. , 1994, Archives of neurology.
[38] Khe Chai Sim,et al. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems , 2010, INTERSPEECH.
[39] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.
[40] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[41] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[42] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Khe Chai Sim,et al. An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models , 2019, INTERSPEECH.
[44] Vassilios Digalakis,et al. Speaker adaptation using constrained estimation of Gaussian mixtures , 1995, IEEE Trans. Speech Audio Process..
[45] Dong Yu,et al. Exploring convolutional neural network structures and optimization techniques for speech recognition , 2013, INTERSPEECH.
[46] Sanjeev Khudanpur,et al. End-to-end Speech Recognition Using Lattice-free MMI , 2018, INTERSPEECH.
[47] I-Fan Chen,et al. Maximum a posteriori adaptation of network parameters in deep models , 2015, INTERSPEECH.
[48] Ciro Martins,et al. Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system , 1995, EUROSPEECH.
[49] Yifan Gong,et al. Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[50] David J. C. MacKay,et al. A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.
[51] Andrew W. Senior,et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling , 2014, INTERSPEECH.
[52] Yifan Gong,et al. Speaker Adaptation for End-to-End CTC Models , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[53] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[54] Li-Rong Dai,et al. Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[55] Diederik P. Kingma,et al. Stochastic Gradient VB and the Variational Auto-Encoder , 2013 .
[56] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[57] Alan F. Murray,et al. Enhanced MLP performance and fault tolerance resulting from synaptic weight noise during training , 1994, IEEE Trans. Neural Networks.
[58] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[59] Hank Liao,et al. Speaker adaptation of context dependent deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[60] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.
[61] Biing-Hwang Juang,et al. Speaker-Invariant Training Via Adversarial Learning , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[62] Mark J. F. Gales. Cluster adaptive training of hidden Markov models , 2000, IEEE Trans. Speech Audio Process..
[63] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[64] Shih-Chii Liu,et al. Parameter Uncertainty for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[66] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[67] Jen-Tzung Chien,et al. Bayesian Recurrent Neural Network for Language Modeling , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[68] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[69] Geoffrey Zweig,et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[70] Rohit Prabhavalkar,et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[71] Jianwei Yu,et al. Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[72] Khe Chai Sim,et al. Subspace LHUC for Fast Adaptation of Deep Neural Network Acoustic Models , 2016, INTERSPEECH.
[73] Andrew W. Senior,et al. Improving DNN speaker independence with I-vector inputs , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[74] S. M. Siniscalchi,et al. Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[75] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[76] Pietro Laface,et al. Linear hidden transformations for adaptation of hybrid ANN/HMM models , 2007, Speech Commun..
[77] Mark J. F. Gales,et al. The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation , 2015, INTERSPEECH.
[78] Alex Graves,et al. Practical Variational Inference for Neural Networks , 2011, NIPS.
[79] Chin-Hui Lee,et al. Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[80] Brian Kingsbury,et al. Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300 , 2020, INTERSPEECH.
[81] Chao Zhang,et al. Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling , 2015, INTERSPEECH.
[82] Philip C. Woodland,et al. An investigation into vocal tract length normalisation , 1999, EUROSPEECH.