Understanding Pictograph with Facial Features: End-to-End Sentence-Level Lip Reading of Chinese
暂无分享,去创建一个
Fan Yang | Ming Liu | Xiaobing Zhang | Nianbo Liu | Haigang Gong | Xili Dai | Ming Liu | Fan Yang | Xiaobing Zhang | Hai-gang Gong | Xili Dai | Nianbo Liu
[1] Stephen J. Cox,et al. The challenge of multispeaker lip-reading , 2008, AVSP.
[2] Tetsuya Ogata,et al. Audio-visual speech recognition using deep learning , 2014, Applied Intelligence.
[3] Shimon Whiteson,et al. LipNet: End-to-End Sentence-level Lipreading , 2016, 1611.01599.
[4] Matti Pietikäinen,et al. A review of recent advances in visual speech decoding , 2014, Image Vis. Comput..
[5] Matti Pietikäinen,et al. OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[6] Jing Huang,et al. Audio-visual deep learning for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[8] Maja Pantic,et al. Deep complementary bottleneck features for visual speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[10] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[11] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[12] Joon Son Chung,et al. Out of Time: Automated Lip Sync in the Wild , 2016, ACCV Workshops.
[13] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[14] Vaibhava Goel,et al. Deep multimodal learning for Audio-Visual Speech Recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Amit Garg amit,et al. Lip reading using CNN and LSTM , 2016 .
[16] Joon Son Chung,et al. Lip Reading Sentences in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.
[18] Jürgen Schmidhuber,et al. Lipreading with long short-term memory , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[20] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Timothy F. Cootes,et al. Extraction of Visual Features for Lipreading , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[22] Tetsuya Ogata,et al. Lipreading using convolutional neural network , 2014, INTERSPEECH.
[23] Qingyuan Tian. Comparing information entropy between Chinese and English language , 2012 .
[24] Stephen J. Cox,et al. Improving lip-reading performance for robust audiovisual speech recognition using DNNs , 2015, AVSP.
[25] Chalapathy Neti,et al. Recent advances in the automatic recognition of audiovisual speech , 2003, Proc. IEEE.
[26] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Daniel Jurafsky,et al. Lexicon-Free Conversational Speech Recognition with Neural Networks , 2015, NAACL.
[28] Matti Pietikäinen,et al. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON MULTIMEDIA 1 Lipreading with Local Spatiotemporal Descriptors , 2022 .