暂无分享,去创建一个
[1] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[2] Alan F. Murray,et al. Enhanced MLP performance and fault tolerance resulting from synaptic weight noise during training , 1994, IEEE Trans. Neural Networks.
[3] Hermann Ney,et al. Gammatone Features and Feature Combination for Large Vocabulary Speech Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[4] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[5] Yann LeCun,et al. Regularization of Neural Networks using DropConnect , 2013, ICML.
[6] Andreas Stolcke,et al. The Microsoft 2017 Conversational Speech Recognition System , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Hermann Ney,et al. Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition , 2018, INTERSPEECH.
[8] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .
[9] Xavier L. Aubert,et al. Combining TDNN and HMM in a hybrid system for improved continuous-speech recognition , 1994, IEEE Trans. Speech Audio Process..
[10] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[11] Brian Kingsbury,et al. Advancing RNN Transducer Technology for Speech Recognition , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.
[13] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[14] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[15] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Brian Kingsbury,et al. Sequence Noise Injected Training for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[18] Ehsan Variani,et al. A Density Ratio Approach to Language Model Fusion in End-to-End Automatic Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[19] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[22] Hermann Ney,et al. Acoustic Modeling of Speech Waveform Based on Multi-Resolution, Neural Network Signal Processing , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Shuang Xu,et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Jeff A. Bilmes,et al. COMBINATION AND JOINT TRAINING OF ACOUSTIC CLASSIFIERS FOR SPEECH RECOGNITION , 2000 .
[25] Brian Kingsbury,et al. Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300 , 2020, INTERSPEECH.
[26] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.
[27] Jesse Engel,et al. Learning Multiscale Features Directly from Waveforms , 2016, INTERSPEECH.
[28] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.
[29] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[30] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[31] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[32] Qian Zhang,et al. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Naoyuki Kanda,et al. Elastic spectral distortion for low resource speech recognition with deep neural networks , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[34] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[35] Yoshua Bengio,et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations , 2016, ICLR.
[36] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[37] Yoshua Bengio,et al. On Using Monolingual Corpora in Neural Machine Translation , 2015, ArXiv.
[38] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[40] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[41] Xiaodong Cui,et al. English Conversational Telephone Speech Recognition by Humans and Machines , 2017, INTERSPEECH.
[42] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[43] Jiri Matas,et al. On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[44] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[45] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[46] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[47] Kyu J. Han,et al. The CAPIO 2017 Conversational Speech Recognition System , 2017, ArXiv.
[48] Ke Li,et al. A Time-Restricted Self-Attention Layer for ASR , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Yang Liu,et al. Modeling Coverage for Neural Machine Translation , 2016, ACL.
[50] Noah Constant,et al. Character-Level Language Modeling with Deeper Self-Attention , 2018, AAAI.
[51] Sathish Reddy Indurthi,et al. Small Energy Masking for Improved Neural Network Training for End-To-End Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[52] Brian Kingsbury,et al. End-to-End Spoken Language Understanding Without Full Transcripts , 2020, INTERSPEECH.
[53] Hermann Ney,et al. Training Language Models for Long-Span Cross-Sentence Evaluation , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[54] Hermann Ney,et al. Cumulative Adaptation for BLSTM Acoustic Models , 2019, INTERSPEECH.
[55] Sanjeev Khudanpur,et al. Semi-supervised maximum mutual information training of deep neural network acoustic models , 2015, INTERSPEECH.