Real-time fingertip localization conditioned on hand gesture classification

A method to obtain accurate hand gesture classification and fingertip localization from depth images is proposed. The Oriented Radial Distribution feature is utilized, exploiting its ability to globally describe hand poses, but also to locally detect fingertip positions. Hence, hand gesture and fingertip locations are characterized with a single feature calculation. We propose to divide the difficult problem of locating fingertips into two more tractable problems, by taking advantage of hand gesture as an auxiliary variable. Along with the method we present the ColorTip dataset, a dataset for hand gesture recognition and fingertip classification using depth data. ColorTip contains sequences where actors wear a glove with colored fingertips, allowing automatic annotation. The proposed method is evaluated against recent works in several datasets, achieving promising results in both gesture classification and fingertip localization.

[1]  Sebastian Thrun,et al.  Real time motion capture using a single time-of-flight camera , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[3]  Lale Akarun,et al.  Real time hand pose estimation using depth sensors , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[4]  Joachim Hornegger,et al.  Gesture recognition with a Time-Of-Flight camera , 2008, Int. J. Intell. Syst. Technol. Appl..

[5]  Xiaolong Zhu,et al.  Single-frame hand gesture recognition using color and depth kernel descriptors , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[6]  Rod McCall,et al.  Lightweight palm and finger tracking for real-time 3D gesture control , 2011, 2011 IEEE Virtual Reality Conference.

[7]  David A. Forsyth,et al.  Generalizing motion edits with Gaussian processes , 2009, ACM Trans. Graph..

[8]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[9]  Hans-Peter Seidel,et al.  A data-driven approach for real-time full body pose reconstruction from a depth camera , 2011, 2011 International Conference on Computer Vision.

[10]  J. J. McGregor,et al.  Backtrack search algorithms and the maximal common subgraph problem , 1982, Softw. Pract. Exp..

[11]  Adolfo López,et al.  INTAIRACT: Joint Hand Gesture and Fingertip Classification for Touchless Interaction , 2012, ECCV Workshops.

[12]  Jon Louis Bentley,et al.  An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.

[13]  Jitendra Malik,et al.  Recognizing Objects in Range Data Using Regional Point Descriptors , 2004, ECCV.

[14]  Philippe Salembier,et al.  Binary partition tree as an efficient representation for image processing, segmentation, and information retrieval , 2000, IEEE Trans. Image Process..

[15]  Junsong Yuan,et al.  Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera , 2011, ACM Multimedia.

[16]  Luc Van Gool,et al.  Real-time sign language letter and word recognition from depth data , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[17]  Gary R. Bradski,et al.  Fast 3D recognition and pose using the Viewpoint Feature Histogram , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[18]  Nicolas Pugeault,et al.  Spelling it out: Real-time ASL fingerspelling recognition , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[19]  Jovan Popović,et al.  Real-time hand-tracking with a color glove , 2009, SIGGRAPH 2009.

[20]  Ulrich Neumann,et al.  Real-time Hand Pose Recognition Using Low-Resolution Depth Images , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[21]  Nassir Navab,et al.  Human skeleton tracking from depth data using geodesic distances and optical flow , 2012, Image Vis. Comput..

[22]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[23]  Xia Liu,et al.  Hand gesture recognition using depth data , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[24]  Lale Akarun,et al.  Randomized decision forests for static and dynamic hand shape classification , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[25]  Federico Tombari,et al.  Unique Signatures of Histograms for Local Surface Description , 2010, ECCV.

[26]  Javier Ruiz Hidalgo,et al.  Oriented radial distribution on depth data: Application to the detection of end-effectors , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[27]  Michael G. Strintzis,et al.  Real-time hand posture recognition using range data , 2008, Image Vis. Comput..

[28]  Luc Van Gool,et al.  Combining RGB and ToF cameras for real-time 3D hand gesture interaction , 2011, WACV.

[29]  Xiaodong Yang,et al.  Histogram of 3D Facets: A characteristic descriptor for hand gesture recognition , 2013, 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[30]  Radu Bogdan Rusu,et al.  Semantic 3D Object Maps for Everyday Manipulation in Human Living Environments , 2010, KI - Künstliche Intelligenz.

[31]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[32]  Javier Ruiz Hidalgo,et al.  Real-Time Head and Hand Tracking Based on 2.5D Data , 2012 .

[33]  Michael Beetz,et al.  Detecting and segmenting objects for mobile manipulation , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[34]  Luc Van Gool,et al.  Haarlet-based hand gesture recognition for 3D interaction , 2009, 2009 Workshop on Applications of Computer Vision (WACV).

[35]  Joachim Hornegger,et al.  3-D gesture-based scene navigation in medical imaging applications using Time-of-Flight cameras , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.