On quantifying the quality of acoustic models in hybrid DNN-HMM ASR
[1] Stephen Cox, et al. Some statistical issues in the comparison of speech recognition algorithms, 1989, International Conference on Acoustics, Speech, and Signal Processing.
[2] Hynek Hermansky, et al. Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[3] S. J. Young, et al. Tree-based state tying for high accuracy acoustic modelling, 1994.
[4] H. Bourlard, et al. Links Between Markov Models and Multilayer Perceptrons, 1990, IEEE Trans. Pattern Anal. Mach. Intell.
[5] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Jeff A. Bilmes, et al. What HMMs Can Do, 2006, IEICE Trans. Inf. Syst.
[7] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[8] F. Jelinek, et al. Continuous speech recognition by statistical methods, 1976, Proceedings of the IEEE.
[9] Jonathan G. Fiscus, et al. Tools for the analysis of benchmark speech recognition tests, 1990, International Conference on Acoustics, Speech, and Signal Processing.
[10] Larry Gillick, et al. Discriminative training for speech recognition is compensating for statistical dependence in the HMM framework, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Suman V. Ravuri, et al. How Neural Network Depth Compensates for HMM Conditional Independence Assumptions in DNN-HMM Acoustic Models, 2016, INTERSPEECH.
[12] Lukás Burget, et al. Sequence-discriminative training of deep neural networks, 2013, INTERSPEECH.
[13] Tasha Nagamine, et al. Exploring how deep neural networks form phonemic categories, 2015, INTERSPEECH.
[14] Geoffrey E. Hinton, et al. Understanding how Deep Belief Networks perform acoustic modelling, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.
[16] Hervé Bourlard, et al. Exploiting low-dimensional structures to enhance DNN based acoustic modeling in speech recognition, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Hervé Bourlard, et al. Sparse modeling of neural network posterior probabilities for exemplar-based speech recognition, 2016, Speech Commun.
[18] Frank Fallside, et al. A recurrent error propagation network speech recognition system, 1991.
[19] Hervé Bourlard, et al. Exploiting Eigenposteriors for Semi-Supervised Training of DNN Acoustic Models with Sequence Discrimination, 2017, INTERSPEECH.
[20] Andrew W. Senior, et al. Long short-term memory recurrent neural network architectures for large scale acoustic modeling, 2014, INTERSPEECH.
[21] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[22] Samy Bengio, et al. Developing and enhancing posterior based speech recognition systems, 2005, INTERSPEECH.
[23] Frederick Jelinek, et al. Statistical methods for speech recognition, 1997.
[24] Sanjeev Khudanpur, et al. A time delay neural network architecture for efficient modeling of long temporal contexts, 2015, INTERSPEECH.
[25] Yifan Gong, et al. A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models, 2014, INTERSPEECH.
[26] Mari Ostendorf, et al. From HMM's to segment models: a unified view of stochastic modeling for speech recognition, 1996, IEEE Trans. Speech Audio Process.
[27] Jeff A. Bilmes, et al. Maximum mutual information based reduction strategies for cross-correlation based joint distributional modeling, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[28] Hervé Bourlard, et al. Low-Rank Representation of Nearest Neighbor Phone Posterior Probabilities to Enhance DNN Acoustic Modeling, 2016, INTERSPEECH.
[29] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Hervé Bourlard, et al. Connectionist Speech Recognition: A Hybrid Approach, 1993.
[31] Hervé Bourlard, et al. Low-rank and sparse soft targets to learn better DNN acoustic models, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Hynek Hermansky, et al. Exploiting contextual information for improved phoneme recognition, 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[33] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.
[34] Hynek Hermansky, et al. Temporal patterns (TRAPs) in ASR of noisy speech, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99 (Cat. No.99CH36258).
[35] Jeff A. Bilmes, et al. What HMMs Can't Do, 2004.
[36] Geoffrey E. Hinton, et al. Phoneme recognition using time-delay neural networks, 1989, IEEE Trans. Acoust. Speech Signal Process.
[37] Daniel P. W. Ellis, et al. Tandem connectionist feature extraction for conventional HMM systems, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings (Cat. No.00CH37100).
[38] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[39] Yifan Gong, et al. Learning small-size DNN with output-distribution-based criteria, 2014, INTERSPEECH.
[40] Larry Gillick, et al. Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[41] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.