Direct hand pose estimation for immersive gestural interaction

- We present a novel approach to 3D gestural interaction based on depth images.
- We propose a direct method for real-time hand pose estimation.
- Our method can be generalized to accommodate different interaction systems.
- The system is implemented and tested on a desktop computer and a mobile platform.
- Usability analysis reveals that our system can be used in real-world applications.

This paper presents a novel approach to intuitive gesture-based interaction using depth data acquired by a Kinect sensor. The main challenge in enabling immersive gestural interaction is dynamic gesture recognition, which can be formulated as a combination of two tasks: gesture recognition and gesture pose estimation. Incorporating a fast and robust pose estimation method lessens this burden to a great extent. In this paper we propose a direct method for real-time hand pose estimation: based on range images, a new version of the optical flow constraint equation is derived, which can be used to estimate 3D hand motion directly, without imposing additional constraints. Extensive experiments show that the proposed approach runs in real time with high accuracy. As a proof of concept, we demonstrate the system's performance in 3D object manipulation on two different setups, a desktop computer and a mobile platform, which shows the system's capability to accommodate different interaction procedures. In addition, a user study is conducted to evaluate learnability, user experience, and interaction quality of 3D gestural interaction in comparison to 2D touchscreen interaction.
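
As a rough illustration of the "direct" idea (not the paper's exact derivation), the sketch below assumes an orthographic range image Z(X, Y, t) and a range-flow constraint in the spirit of Horn and Harris, pU + qV - W + Z_t = 0 with p = dZ/dX, q = dZ/dY, and rigid-body velocity (U, V, W) = t + omega x (X, Y, Z). A single linear least-squares solve over the valid depth pixels then yields the six motion parameters. The function name, parameters, and preprocessing are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def estimate_rigid_motion(Z0, Z1, dx=1.0, dy=1.0):
    """Hypothetical sketch: least-squares rigid motion (omega, t) from two depth frames."""
    H, W = Z0.shape
    X, Y = np.meshgrid(np.arange(W) * dx, np.arange(H) * dy)

    # Spatial gradients p = dZ/dX, q = dZ/dY and temporal derivative Zt (dt = 1).
    q, p = np.gradient(Z0, dy, dx)
    Zt = Z1 - Z0

    # Use only pixels with valid depth in both frames (e.g. the segmented hand region).
    m = np.isfinite(Z0) & np.isfinite(Z1)
    Xv, Yv, Zv, pv, qv, Ztv = X[m], Y[m], Z0[m], p[m], q[m], Zt[m]

    # Per-pixel constraint  p*U + q*V - W + Zt = 0  with rigid velocity
    # (U, V, W) = t + omega x (X, Y, Z) becomes linear in
    # theta = [w1, w2, w3, t1, t2, t3].
    A = np.column_stack([
        -qv * Zv - Yv,        # coefficient of w1
         pv * Zv + Xv,        # coefficient of w2
         qv * Xv - pv * Yv,   # coefficient of w3
         pv,                  # coefficient of t1
         qv,                  # coefficient of t2
        -np.ones_like(pv),    # coefficient of t3
    ])
    theta, *_ = np.linalg.lstsq(A, -Ztv, rcond=None)
    return theta[:3], theta[3:]  # (omega, t)
```

In practice such a solve would be restricted to a segmented hand region and iterated (or embedded in a coarse-to-fine scheme) for larger motions; the point of the sketch is only that depth gradients alone give a linear system for 3D motion, with no feature matching or extra constraints.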
