Concatenated Frame Image Based CNN for Visual Speech Recognition
Matti Pietikäinen | Guoying Zhao | Takeshi Saitoh | Ziheng Zhou
[1] Yochai Konig,et al. "Eigenlips" for robust speech recognition , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[2] Juergen Luettin,et al. Audio-Visual Speech Modeling for Continuous Speech Recognition , 2000, IEEE Trans. Multim..
[3] Timothy F. Cootes,et al. Extraction of Visual Features for Lipreading , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[4] Sridha Sridharan,et al. Patch-based analysis of visual speech from multiple views , 2008, AVSP.
[5] Matti Pietikäinen,et al. Lipreading with Local Spatiotemporal Descriptors , 2009, IEEE Trans. Multim..
[6] Christian Wolf,et al. Sequential Deep Learning for Human Action Recognition , 2011, HBU.
[7] Daijin Kim,et al. Real-time lip reading system for isolated Korean word recognition , 2011, Pattern Recognit..
[8] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[9] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[10] Takeshi Saitoh. Efficient face model for lip reading , 2013, AVSP.
[11] Tetsuya Ogata,et al. Lipreading using convolutional neural network , 2014, INTERSPEECH.
[12] Matti Pietikäinen,et al. A review of recent advances in visual speech decoding , 2014, Image Vis. Comput..
[13] Qiang Chen,et al. Network In Network , 2013, ICLR.
[14] Mohamed R. Amer,et al. Multimodal fusion using dynamic hybrid models , 2014, IEEE Winter Conference on Applications of Computer Vision.
[15] Matti Pietikäinen,et al. OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[16] Takeshi Saitoh,et al. Optical flow based lip reading using non rectangular ROI and head motion reduction , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[17] Tetsuya Takiguchi,et al. Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss , 2015 .
[18] Xuelong Li,et al. Temporal Multimodal Learning in Audiovisual Speech Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).