Lip Reading in Profile
[1] Matti Pietikäinen, et al. A review of recent advances in visual speech decoding, 2014, Image Vis. Comput.
[2] Rainer Lienhart, et al. Reliable Transition Detection in Videos: A Survey and Practitioner's Guide, 2001, Int. J. Image Graph.
[3] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[4] Joon Son Chung, et al. Lip Reading in the Wild, 2016, ACCV.
[5] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[6] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[7] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.
[8] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[9] Matti Pietikäinen, et al. A Compact Representation of Visual Speech Data Using Latent Variables, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[11] Joon Son Chung, et al. Out of Time: Automated Lip Sync in the Wild, 2016, ACCV Workshops.
[12] Takeo Kanade, et al. An Iterative Image Registration Technique with an Application to Stereo Vision, 1981, IJCAI.
[13] Shengcai Liao, et al. Learning Face Representation from Scratch, 2014, ArXiv.
[14] Andrew Zisserman, et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets, 2014, BMVC.
[15] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Davis E. King, et al. Dlib-ml: A Machine Learning Toolkit, 2009, J. Mach. Learn. Res.
[17] Kee-Eung Kim, et al. Multi-view Automatic Lip-Reading Using Neural Network, 2016, ACCV Workshops.
[18] Josephine Sullivan, et al. One millisecond face alignment with an ensemble of regression trees, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[19] Matti Pietikäinen, et al. OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis, 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[20] Shimon Whiteson, et al. LipNet: Sentence-level Lipreading, 2016, ArXiv.
[21] Matti Pietikäinen, et al. Concatenated Frame Image Based CNN for Visual Speech Recognition, 2016, ACCV Workshops.
[22] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[23] Joon Son Chung, et al. Lip Reading Sentences in the Wild, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Wei Liu, et al. SSD: Single Shot MultiBox Detector, 2015, ECCV.
[25] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[26] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[27] Mark Liberman, et al. Speaker identification on the SCOTUS corpus, 2008.
[28] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.