Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation
暂无分享,去创建一个
[1] Jean Carletta,et al. Unleashing the killer corpus: experiences in creating the multi-everything AMI Meeting Corpus , 2007, Lang. Resour. Evaluation.
[2] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[3] Yongqiang Wang,et al. An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] Yifan Gong,et al. Regularized sequence-level deep neural network model adaptation , 2015, INTERSPEECH.
[5] Koichi Shinoda,et al. Speaker adaptation of deep neural networks using a hierarchy of output layers , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[6] Chao Zhang,et al. Parameterised sigmoid and reLU hidden activation functions for DNN acoustic modelling , 2015, INTERSPEECH.
[7] Hui Jiang,et al. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Jonathan G. Fiscus,et al. 2000 NIST EVALUATION OF CONVERSATIONAL SPEECH RECOGNITION OVER THE TELEPHONE: ENGLISH AND MANDAR IN PERFORMANCE RESULTS , 2000 .
[9] Hervé Bourlard,et al. Connectionist probability estimators in HMM speech recognition , 1994, IEEE Trans. Speech Audio Process..
[10] Vaibhava Goel,et al. Annealed dropout training of deep networks , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[11] Steve Renals,et al. Convolutional Neural Networks for Distant Speech Recognition , 2014, IEEE Signal Processing Letters.
[12] Ji Wu,et al. Rapid adaptation for deep neural networks through multi-task learning , 2015, INTERSPEECH.
[13] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[14] Thomas Hain,et al. Recognition and understanding of meetings the AMI and AMIDA projects , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[15] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[16] Steve Renals,et al. Multi-level adaptive networks in tandem and hybrid ASR systems , 2013, ICASSP.
[17] Mark J. F. Gales,et al. Multi-basis adaptive neural network for rapid adaptation in speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Sebastian Stüker,et al. Overview of the IWSLT 2012 evaluation campaign , 2012, IWSLT.
[19] Steve Renals,et al. Differentiable pooling for unsupervised speaker adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[21] Kai Yu,et al. Cluster adaptive training for deep neural network , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Tara N. Sainath,et al. Improvements to Deep Convolutional Neural Networks for LVCSR , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[23] Jan Cernocký,et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[24] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[25] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Peter Bell,et al. Structured output layer with auxiliary targets for context-dependent acoustic modelling , 2015, INTERSPEECH.
[27] Yifan Gong,et al. Factorized adaptation for deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] R. French. Catastrophic forgetting in connectionist networks , 1999, Trends in Cognitive Sciences.
[29] Naveen Parihar,et al. Performance analysis of the Aurora large vocabulary baseline system , 2004, 2004 12th European Signal Processing Conference.
[30] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[31] Edmondo Trentin,et al. Networks with trainable amplitude of activation functions , 2001, Neural Networks.
[32] Yongqiang Wang,et al. Adaptation of deep neural network acoustic models using factorised i-vectors , 2014, INTERSPEECH.
[33] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[34] Zdravko Kacic,et al. A novel loss function for the overall risk criterion based discriminative training of HMM models , 2000, INTERSPEECH.
[35] I-Fan Chen,et al. Maximum a posteriori adaptation of network parameters in deep models , 2015, INTERSPEECH.
[36] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[37] Hui Jiang,et al. Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition , 2013, INTERSPEECH.
[38] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[39] Fergus McInnes,et al. The UEDIN ASR Systems for the IWSLT 2014 Evaluation , 2014 .
[40] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[41] Florian Metze,et al. Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[42] Steve Renals,et al. Connectionist probability estimation in the DECIPHER speech recognition system , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[43] Themos Stafylakis,et al. I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Richard M. Schwartz,et al. A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[45] Kaisheng Yao,et al. Adaptation of context-dependent deep neural networks for automatic speech recognition , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[46] Tara N. Sainath,et al. Auto-encoder bottleneck features using deep belief networks , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Steve Renals,et al. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[48] Khe Chai Sim,et al. On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in DNN acoustic models , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Khe Chai Sim,et al. Learning factorized feature transforms for speaker normalization , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[50] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[51] Lukás Burget,et al. iVector-based discriminative adaptation for automatic speech recognition , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[52] Stephen Cox,et al. RecNorm: Simultaneous Normalisation and Classification Applied to Speech Recognition , 1990, NIPS.
[53] Tara N. Sainath,et al. Deep Belief Networks using discriminative features for phone recognition , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[54] Khe Chai Sim,et al. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems , 2010, INTERSPEECH.
[55] Zhizheng Wu,et al. Human vs machine spoofing detection on wideband and narrowband data , 2015, INTERSPEECH.
[56] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[57] Li-Rong Dai,et al. Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[58] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[59] Steve Renals,et al. Revisiting hybrid and GMM-HMM system combination techniques , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[60] Thomas Hain,et al. Using neural network front-ends on far field multiple microphones based speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Philip C. Woodland. Speaker adaptation for continuous density HMMs: a review , 2001 .
[62] Hank Liao,et al. Speaker adaptation of context dependent deep neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[63] Mark J. F. Gales,et al. Cambridge university transcription systems for the multi-genre broadcast challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[64] Tara N. Sainath,et al. FUNDAMENTAL TECHNOLOGIES IN MODERN SPEECH RECOGNITION Digital Object Identifier 10.1109/MSP.2012.2205597 , 2012 .
[65] Lukás Burget,et al. Sequence-discriminative training of deep neural networks , 2013, INTERSPEECH.
[66] Steve Renals,et al. SAT-LHUC: Speaker adaptive training for learning hidden unit contributions , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[67] S. M. Siniscalchi,et al. Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[68] Mauro Cettolo,et al. WIT3: Web Inventory of Transcribed and Translated Talks , 2012, EAMT.
[69] Lukás Burget,et al. Transcribing Meetings With the AMIDA Systems , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[70] Brian Kingsbury,et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[71] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[72] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[73] Shigeru Katagiri,et al. Speaker Adaptive Training using Deep Neural Networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[74] Tomohiro Nakatani,et al. Context adaptive deep neural networks for fast acoustic model adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[75] R. French,et al. Catastrophic Forgetting in Connectionist Networks: Causes, Consequences and Solutions , 1994 .
[76] Hideki Kashioka,et al. The NICT ASR system for IWSLT2011 , 2011, IWSLT.
[77] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[78] Petr Motlícek,et al. Towards utterance-based neural network adaptation in acoustic modeling , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[79] Yifan Gong,et al. Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[80] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[81] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[82] Ciro Martins,et al. Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system , 1995, EUROSPEECH.
[83] Yifan Gong,et al. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[84] Horacio Franco,et al. Connectionist speaker normalization and adaptation , 1995, EUROSPEECH.
[85] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.