Multi-modal user interaction method based on gaze tracking and gesture recognition

This paper presents a gaze tracking technology that provides a convenient, human-centric interface for multimedia consumption without any wearable device. It enables a user to interact with various multimedia content on a large display at a distance by tracking the user's movement and acquiring high-resolution eye images. The paper also presents a gesture recognition technology that helps users interact with scene descriptions by controlling and rendering scene objects; it is based on hidden Markov models (HMMs) and conditional random fields (CRFs) and uses a commercial depth sensor. Finally, the paper describes how these new sensors can be combined with MPEG standards to achieve interoperability among interactive applications, new user interaction devices, and users.
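To make the HMM-based recognition pipeline concrete, the following is a minimal sketch, not the authors' implementation: one Gaussian HMM is trained per gesture class over per-frame depth-sensor features, and a new sequence is labeled by the model with the highest log-likelihood. The hmmlearn library, the feature layout, and the helper names are assumptions for illustration; the paper's actual system additionally uses CRFs, which this sketch omits.

```python
import numpy as np
from hmmlearn import hmm  # assumed dependency, not named in the paper


def train_gesture_models(sequences_by_gesture, n_states=5):
    """Fit one Gaussian HMM per gesture class.

    sequences_by_gesture: dict mapping a gesture label to a list of
    (T_i, D) arrays of per-frame features, e.g. normalized 3D joint
    positions from a commercial depth sensor (hypothetical layout).
    """
    models = {}
    for gesture, seqs in sequences_by_gesture.items():
        X = np.vstack(seqs)                # stack all frames of this class
        lengths = [len(s) for s in seqs]   # per-sequence frame counts
        m = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=30)
        m.fit(X, lengths)                  # Baum-Welch over all sequences
        models[gesture] = m
    return models


def classify(models, observation):
    """Label a new (T, D) feature sequence by maximum log-likelihood."""
    return max(models, key=lambda g: models[g].score(observation))
```

In a likelihood-ratio variant, a separate "non-gesture" threshold model can be scored alongside the class models so that spurious movements are rejected rather than forced into the nearest gesture class.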
