论文信息 - Deep Convolutional Neural Networks for Large-scale Speech Tasks - 字舞流文

Deep Convolutional Neural Networks for Large-scale Speech Tasks

Tara N. Sainath | Brian Kingsbury | Bhuvana Ramabhadran | George Saon | Hagen Soltau | Abdel-rahman Mohamed | George E. Dahl | Abdel-rahman Mohamed | T. Sainath | Brian Kingsbury | H. Soltau | B. Ramabhadran | G. Saon

[1] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.

[2] Tara N. Sainath,et al. Improvements to Deep Convolutional Neural Networks for LVCSR , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.

[3] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4] Li Deng,et al. A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[5] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6] Rob Fergus,et al. Stochastic Pooling for Regularization of Deep Convolutional Neural Networks , 2013, ICLR.

[7] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[8] Tara N. Sainath,et al. FUNDAMENTAL TECHNOLOGIES IN MODERN SPEECH RECOGNITION Digital Object Identifier 10.1109/MSP.2012.2205597 , 2012 .

[9] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[10] Dong Yu,et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks , 2012, ICML.

[11] Yann LeCun,et al. Convolutional neural networks applied to house numbers digit classification , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[12] Gerald Penn,et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[13] Geoffrey E. Hinton,et al. Understanding how Deep Belief Networks perform acoustic modelling , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[14] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[15] Tara N. Sainath,et al. Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization , 2012, INTERSPEECH.

[16] Navdeep Jaitly,et al. Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition , 2012, INTERSPEECH.

[17] Tara N. Sainath,et al. Making Deep Belief Networks effective for large vocabulary continuous speech recognition , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.

[18] Tara N. Sainath,et al. Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[19] Brian Kingsbury,et al. The IBM Attila speech recognition toolkit , 2010, 2010 IEEE Spoken Language Technology Workshop.

[20] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.

[21] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[22] Brian Kingsbury,et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[23] Brian Kingsbury,et al. Boosted MMI for model and feature-space discriminative training , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[24] Y. LeCun,et al. Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[25] Mark J. F. Gales,et al. Semi-tied covariance matrices for hidden Markov models , 1999, IEEE Trans. Speech Audio Process..

[26] Yoshua Bengio,et al. Convolutional networks for images, speech, and time series , 1998 .

[27] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..

[28] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[29] Ah Chung Tsoi,et al. Face recognition: a convolutional neural-network approach , 1997, IEEE Trans. Neural Networks.

[30] Li Lee,et al. Speaker normalization using efficient frequency warping procedures , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[31] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .

[32] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .

[33] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..