Hermann Ney | Ralf Schlüter | Alexander Gerstenberger | Mohammad Zeineldeen | Jingjing Xu | Christoph Lüscher | Wilfried Michel
[1] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[2] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[3] Abien Fred Agarap. Deep Learning using Rectified Linear Units (ReLU) , 2018, ArXiv.
[4] Hermann Ney,et al. Speaker adaptive joint training of Gaussian mixture models and bottleneck features , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[5] Lei Xie,et al. Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition , 2021, Interspeech.
[6] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Hermann Ney,et al. A comprehensive study of deep bidirectional LSTM RNNs for acoustic modeling in speech recognition , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Gerald Penn,et al. Convolutional Neural Networks for Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Hermann Ney,et al. The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[12] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[13] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[14] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[15] Hermann Ney,et al. Training Language Models for Long-Span Cross-Sentence Evaluation , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[16] Geoffrey Zweig,et al. DEJA-VU: Double Feature Presentation and Iterated Loss in Deep Transformer Networks , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Geoffrey Zweig,et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[19] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[20] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Timothy Dozat,et al. Incorporating Nesterov Momentum into Adam , 2016 .
[22] Brian Kingsbury,et al. On the limit of English conversational speech recognition , 2021, Interspeech.
[23] Thomas Hain,et al. Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition , 2006, INTERSPEECH.
[24] Hermann Ney,et al. Gammatone Features and Feature Combination for Large Vocabulary Speech Recognition , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[25] Hermann Ney,et al. RASR/NN: The RWTH neural network toolkit for speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[27] Sanjeev Khudanpur,et al. A time delay neural network architecture for efficient modeling of long temporal contexts , 2015, INTERSPEECH.
[28] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[29] Yonghui Wu,et al. ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context , 2020, INTERSPEECH.
[30] Hermann Ney,et al. LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring , 2019, ArXiv.
[31] Valentin Vielzeuf,et al. Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition , 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[32] Brian Kingsbury,et al. Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300 , 2020, INTERSPEECH.
[33] Quoc V. Le,et al. Searching for Activation Functions , 2018, ArXiv.
[34] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Hermann Ney,et al. Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Hermann Ney,et al. Cumulative Adaptation for BLSTM Acoustic Models , 2019, INTERSPEECH.
[38] Hermann Ney,et al. RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition , 2018, ACL.
[39] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.