A comparative study of 2D and 3D lip tracking methods for AV ASR
[1] Chalapathy Neti, et al. Recent advances in the automatic recognition of audiovisual speech, 2003, Proc. IEEE.
[2] Roland Göcke, et al. Monocular and Stereo Methods for AAM Learning from Video, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[3] Alexander H. Waibel, et al. Real-Time Face and Facial Feature Tracking and Applications, 1998, AVSP.
[4] Roland Göcke, et al. The audio-video Australian English speech data corpus AVOZES, 2012, INTERSPEECH.
[5] Richard B. Reilly, et al. Lessons from speechreading, 2001, IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.
[6] John R. Kender, et al. Finding skin in color images, 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.
[7] Juergen Luettin, et al. Active Shape Models for Visual Speech Feature Extraction, 1996.
[8] Roland Göcke, et al. Iterative Error Bound Minimisation for AAM Alignment, 2006, 18th International Conference on Pattern Recognition (ICPR'06).
[9] Christian Benoît, et al. Which components of the face do humans and machines best speechread, 1996.
[10] Lionel Revéret, et al. A New 3D Lip Model for Analysis and Synthesis of Lip Motion In Speech Production, 1998, AVSP.
[11] Giridharan Iyengar, et al. A cascade image transform for speaker independent automatic speechreading, 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia.
[12] Alex Pentland, et al. 3D lip shapes from video: A combined physical-statistical model, 1998, Speech Commun.
[13] I. H. Öğüş, et al. NATO ASI Series, 1997.
[14] Jason Mora Saragih. The generative learning and discriminative fitting of linear deformable models, 2008.
[15] Ralph Gross, et al. Constructing and Fitting Active Appearance Models With Occlusion, 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.
[16] Roland Göcke. 3D Lip Tracking and Co-inertia Analysis for Improved Robustness of Audio-video Automatic Speech Recognition, 2005, AVSP.
[17] Simon Baker, et al. Lucas-Kanade 20 Years On: A Unifying Framework, 2004, International Journal of Computer Vision.
[18] Roland Göcke. Current trends in joint audio-video signal processing: a review, 2005, Proceedings of the Eighth International Symposium on Signal Processing and Its Applications, 2005.
[19] Eric David Petajan, et al. Automatic Lipreading to Enhance Speech Recognition (Speech Reading), 1984.
[20] Michael Wagner, et al. Aspects of speaking-face data corpus design methodology, 2004, INTERSPEECH.
[21] Simon Baker, et al. Equivalence and efficiency of image alignment algorithms, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[22] Eric D. Petajan. Automatic lipreading to enhance speech recognition, 1984.
[23] Timothy F. Cootes, et al. Lipreading Using Shape, Shading and Scale, 1998, AVSP.
[24] David C. Gibbon, et al. Multi-modal system for locating heads and faces, 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.
[25] Timothy F. Cootes, et al. Active Appearance Models, 1998, ECCV.
[26] Thomas Vetter, et al. Face Recognition Based on Fitting a 3D Morphable Model, 2003, IEEE Trans. Pattern Anal. Mach. Intell.
[27] Paul A. Viola, et al. Rapid object detection using a boosted cascade of simple features, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[28] Timothy F. Cootes, et al. Active Shape Models-Their Training and Application, 1995, Comput. Vis. Image Underst.
[29] Michael Vogt. Fast Matching of a Dynamic Lip Model to Color Video Sequences under Regular Illumination Conditions, 1996.
[30] Hans-Heinrich Bothe. Relations of Audio and Visual Speech Signals in a Physical Feature Space: Implications for the Hearing-impaired, 1996.