Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers
Haihong Tang | Xinchao Wang | Mingli Song | Rui Xu | Peng Hou | Ya Zhao
[1] Samy Bengio, et al. Tacotron: Towards End-to-End Speech Synthesis, 2017, INTERSPEECH.
[2] Andrew Zisserman, et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets, 2014, BMVC.
[3] Gregory J. Wolff, et al. Lipreading by Neural Networks: Visual Preprocessing, Learning, and Sensory Integration, 1993, NIPS.
[4] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[5] Themos Stafylakis, et al. Combining Residual Networks with LSTMs for Lipreading, 2017, INTERSPEECH.
[6] Yoshua Bengio, et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[7] Maja Pantic, et al. Audio-Visual Speech Recognition with a Hybrid CTC/Attention Architecture, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[8] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Li Zhao, et al. Word Attention for Sequence to Sequence Text Understanding, 2018, AAAI.
[10] Koichi Shinoda, et al. Sequence-level Knowledge Distillation for Model Compression of Attention-based Sequence-to-sequence Speech Recognition, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Joon Son Chung, et al. Lip Reading in Profile, 2017, BMVC.
[12] Shimon Whiteson, et al. LipNet: Sentence-level Lipreading, 2016, ArXiv.
[13] Samyam Rajbhandari, et al. Long Short-Term Memory, 2018.
[14] Joon Son Chung, et al. Deep Audio-Visual Speech Recognition, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Yoshua Bengio, et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results, 2014, ArXiv.
[17] Joon Son Chung, et al. Lip Reading Sentences in the Wild, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Naomi Harte, et al. Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition, 2018, ICMI.
[21] Mingli Song, et al. A Cascade Sequence-to-Sequence Model for Chinese Mandarin Lip Reading, 2019, MMAsia.
[22] Hao Zheng, et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline, 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[23] Chunxiao Liu, et al. Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks, 2018, AAAI.
[24] Martin Wattenberg, et al. SmoothGrad: removing noise by adding noise, 2017, ArXiv.
[25] Thomas Paine, et al. Large-Scale Visual Speech Recognition, 2018, INTERSPEECH.
[26] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[27] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[28] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[29] Jitendra Malik, et al. Cross Modal Distillation for Supervision Transfer, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[32] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[34] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.