Deep Neural Networks for Acoustic Modeling in Speech Recognition
暂无分享,去创建一个
Tara N. Sainath | Brian Kingsbury | Geoffrey E. Hinton | Dong Yu | Li Deng | Navdeep Jaitly | Patrick Nguyen | Abdel-rahman Mohamed | Vincent Vanhoucke | George E. Dahl | Andrew W. Senior | Tara N. Sainath | Vincent Vanhoucke | A. Senior | Abdel-rahman Mohamed | L. Deng | Dong Yu | Navdeep Jaitly | P. Nguyen | T. Sainath | Brian Kingsbury | N. Jaitly | Patrick Nguyen
[1] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .
[2] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[3] Biing-Hwang Juang,et al. Maximum likelihood estimation for multivariate mixture observations of markov chains , 1986, IEEE Trans. Inf. Theory.
[4] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] 古井 貞煕,et al. Digital speech processing, synthesis, and recognition , 1989 .
[6] Sadaoki Furui,et al. Digital Speech Processing, Synthesis, and Recognition , 1989 .
[7] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[8] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[9] R. Kompe,et al. Global optimization of a neural network-hidden Markov model hybrid , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[10] Yoshua Bengio,et al. Global optimization of a neural network-hidden Markov model hybrid , 1992, IEEE Trans. Neural Networks.
[11] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[12] Li Deng,et al. Speech recognition using the atomic speech units constructed from overlapping articulatory features , 1994, EUROSPEECH.
[13] Anthony J. Robinson,et al. An application of recurrent nets to phone probability estimation , 1994, IEEE Trans. Neural Networks.
[14] S. Young. Large Vocabulary Continuous Speech Recognition : a ReviewSteve , 1996 .
[15] Hermann Ney,et al. A word graph algorithm for large vocabulary continuous speech recognition , 1994, Comput. Speech Lang..
[16] James R. Glass,et al. Heterogeneous measurements and multiple classifiers for speech recognition , 1998, ICSLP.
[17] Teuvo Kohonen,et al. Speech recognition: a hybrid approach , 1998 .
[18] Francis Jack Smith,et al. Improved phone recognition using Bayesian triphone models , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[19] Li Deng,et al. Computational Models for Speech Production , 2018, Speech Processing.
[20] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[21] Daniel Povey,et al. Large scale discriminative training of hidden Markov models for speech recognition , 2002, Comput. Speech Lang..
[22] Li Deng,et al. An overlapping-feature-based phonological model incorporating linguistic constraints: applications to speech recognition. , 2002, The Journal of the Acoustical Society of America.
[23] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[24] Li Deng,et al. Switching Dynamic System Models for Speech Articulation and Acoustics , 2004 .
[25] N. Morgan,et al. Pushing the envelope - aside [speech recognition] , 2005, IEEE Signal Processing Magazine.
[26] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[27] Dong Yu,et al. Structured speech modeling , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[28] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[29] Yoshua Bengio,et al. An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.
[30] Dong Yu,et al. Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[31] Brian Kingsbury,et al. Boosted MMI for model and feature-space discriminative training , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Wu Chou,et al. Discriminative learning in sequential pattern recognition , 2008, IEEE Signal Processing Magazine.
[33] Brian Kingsbury,et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] James Glass,et al. Research Developments and Directions in Speech Recognition and Understanding, Part 1 , 2009 .
[35] Tara N. Sainath,et al. An exploration of large vocabulary tools for small vocabulary phonetic recognition , 2009, 2009 IEEE Workshop on Automatic Speech Recognition & Understanding.
[36] James R. Glass,et al. Developments and directions in speech recognition and understanding, Part 1 [DSP Education] , 2009, IEEE Signal Processing Magazine.
[37] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[38] Steve Renals,et al. Speech Recognition Using Augmented Conditional Random Fields , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[39] Geoffrey E. Hinton,et al. Deep Belief Networks for phone recognition , 2009 .
[40] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[41] Eric Fosler-Lussier,et al. Backpropagation training for multilayer conditional random field based phone recognition , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Geoffrey E. Hinton,et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine , 2010, NIPS.
[43] VincentPascal,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010 .
[44] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[45] Luca Maria Gambardella,et al. Deep, Big, Simple Neural Nets for Handwritten Digit Recognition , 2010, Neural Computation.
[46] Dong Yu,et al. Investigation of full-sequence training of deep belief networks for speech recognition , 2010, INTERSPEECH.
[47] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[48] Dong Yu,et al. Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition , 2010 .
[49] Quoc V. Le,et al. On optimization methods for deep learning , 2011, ICML.
[50] Pascal Vincent,et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.
[51] Geoffrey Zweig,et al. Speech recognitionwith segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[52] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.
[53] Dong Yu,et al. Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[54] Tara N. Sainath,et al. Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[55] Dong Yu,et al. Deep Convex Net: A Scalable Architecture for Speech Pattern Classification , 2011, INTERSPEECH.
[56] Tara N. Sainath,et al. Deep Belief Networks using discriminative features for phone recognition , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[57] Oriol Vinyals,et al. Comparing multilayer perceptron to Deep Belief Network Tandem features for robust ASR , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[58] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011 .
[59] Geoffrey E. Hinton,et al. Understanding how Deep Belief Networks perform acoustic modelling , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[60] Tara N. Sainath,et al. Auto-encoder bottleneck features using deep belief networks , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Dong Yu,et al. Scalable stacking and learning for building deep architectures , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[62] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[63] Hynek Hermansky,et al. Sparse Multilayer Perceptron for Phoneme Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[64] Heiga Zen,et al. Product of Experts for Statistical Parametric Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[65] Nelson Morgan,et al. Deep and Wide: Multiple Layers in Automatic Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[66] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[67] Navdeep Jaitly,et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition , 2012, INTERSPEECH.
[68] Tara N. Sainath,et al. Improved pre-training of Deep Belief Networks using Sparse Encoding Symmetric Machines , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[69] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[70] Navdeep Jaitly,et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Conversational Speech Recognition , 2012 .
[71] Chin-Hui Lee,et al. Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[72] Dong Yu,et al. A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[73] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.
[74] James R. Glass,et al. Developments and Directions in Speech Recognition and Understanding , Part 1 T , 2022 .