Experimenting with Attention Mechanisms in Joint CTC-Attention Models for Russian Speech Recognition
[1] Yonghong Yan,et al. Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition , 2019, INTERSPEECH.
[2] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[3] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[4] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[5] Alexander L. Ronzhin,et al. HAVRUS Corpus: High-Speed Recordings of Audio-Visual Russian Speech , 2016, SPECOM.
[6] Irina S. Kipyatkova,et al. Investigating Joint CTC-Attention Models for End-to-End Russian Speech Recognition , 2019, SPECOM.
[7] George Saon,et al. Advancing Sequence-to-Sequence Based Speech Recognition , 2019, INTERSPEECH.
[8] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[9] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[10] Irina S. Kipyatkova. Experimenting with Hybrid TDNN/HMM Acoustic Models for Russian Speech Recognition , 2017, SPECOM.
[11] Irina Sergeevna Kipyatkova,et al. Analytical Review of End-to-End Speech Recognition Systems , 2018 .
[12] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[13] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] A. A. Karpov,et al. Information enquiry kiosk with multimodal user interface , 2009, Pattern Recognition and Image Analysis.
[15] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[16] Alexey Karpov,et al. Class-based LSTM Russian Language Model with Linguistic Information , 2020, LREC.
[17] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[18] Zhiheng Huang,et al. Self-attention Networks for Connectionist Temporal Classification in Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[20] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).