[1] Joon Son Chung, et al. Lip Reading Sentences in the Wild, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Themos Stafylakis, et al. Combining Residual Networks with LSTMs for Lipreading, 2017, INTERSPEECH.
[3] Song Han, et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices, 2018, ECCV.
[4] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.
[5] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Joon Son Chung, et al. Deep Lip Reading: a comparison of models and an online application, 2018, INTERSPEECH.
[7] Michael S. Bernstein, et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, 2016, International Journal of Computer Vision.
[8] Xiangyu Zhang, et al. Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection, 2018, ArXiv.
[9] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Xiangyu Zhang, et al. Bounding Box Regression With Uncertainty for Accurate Object Detection, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] K. B. Heimbach. An Empirical Evaluation of Convolutional and Recurrent Neural Networks for Lip Reading, 2018.
[13] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Naomi Harte, et al. Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition, 2018, ICMI.
[15] Chalapathy Neti, et al. Recent advances in the automatic recognition of audiovisual speech, 2003, Proc. IEEE.
[16] Joon Son Chung, et al. Lip Reading in the Wild, 2016, ACCV.
[18] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[19] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[20] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[21] Shimon Whiteson, et al. LipNet: End-to-End Sentence-level Lipreading, 2016, arXiv:1611.01599.
[22] Xiangyu Zhang, et al. Channel Pruning for Accelerating Very Deep Neural Networks, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Joon Son Chung, et al. The Conversation: Deep Audio-Visual Speech Enhancement, 2018, INTERSPEECH.
[24] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[25] Maja Pantic, et al. Prediction-Based Audiovisual Fusion for Classification of Non-Linguistic Vocalisations, 2016, IEEE Transactions on Affective Computing.
[26] Navdeep Jaitly, et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks, 2014, ICML.
[27] Amirsina Torfi, et al. 3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition, 2017, IEEE Access.
[28] D.R. Reddy, et al. Speech recognition by machine: A review, 1976, Proceedings of the IEEE.
[29] Qiang Ji, et al. A Hierarchical Context Model for Event Recognition in Surveillance Video, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[30] Maja Pantic, et al. End-to-End Audiovisual Speech Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Juergen Luettin, et al. Audio-Visual Speech Modeling for Continuous Speech Recognition, 2000, IEEE Trans. Multim.
[33] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[34] Joon Son Chung, et al. Lip Reading in Profile, 2017, BMVC.
[35] Bolei Zhou, et al. Scene Parsing through ADE20K Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Nanning Zheng, et al. Single Image Super-resolution via a Lightweight Residual Convolutional Neural Network, 2017.
[37] Xuelong Li, et al. Temporal Multimodal Learning in Audiovisual Speech Recognition, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Thomas Paine, et al. Large-Scale Visual Speech Recognition, 2018, INTERSPEECH.
[40] Kaiming He, et al. Focal Loss for Dense Object Detection, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[41] Yuchun Ma, et al. AddressNet: Shift-Based Primitives for Efficient Convolutional Neural Networks, 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).
[42] Bo Chen, et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).