Hand Pose Regression via a Classification-Guided Approach

Hand pose estimation from single depth image has achieved great progress in recent years, however, up-to-data methods are still not satisfying the application requirements like in human-computer interaction. One possible reason is that existing methods try to learn a general regression function for all types of hand depth images. To handle this problem, we propose a novel “divide-and-conquer” method, which includes a classification step and a regression step. At first, a convolutional neural network classifier is used to classify the input hand depth image into different types. Then, an effective and efficient multiway cascaded random forest regressor is used to estimate the hand joints’ 3D positions. Experiments demonstrate that the proposed method achieves state-of-the-art performance on challenging dataset. Moreover, the proposed method can be easily combined with other regression method.

[1]  Andrea Tagliasacchi,et al.  Robust Articulated-ICP for Real-Time Hand Tracking , 2015 .

[2]  Min Sun,et al.  Conditional regression forests for human pose estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Jonathan Tompson,et al.  Learning Human Pose Estimation Features with Convolutional Networks , 2013, ICLR.

[4]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[5]  Mircea Nicolescu,et al.  Vision-based hand pose estimation: A review , 2007, Comput. Vis. Image Underst..

[6]  Christian Wolf,et al.  Hand Pose Estimation through Weakly-Supervised Learning of a Rich Intermediate Representation , 2015, ArXiv.

[7]  Vincent Lepetit,et al.  Training a Feedback Loop for Hand Pose Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Gabriel Zachmann,et al.  A Survey of Vision-Based Markerless Hand Tracking Approaches , 2013 .

[9]  Marc Pollefeys,et al.  Capturing Hands in Action Using Discriminative Salient Points and Physics Simulation , 2015, International Journal of Computer Vision.

[10]  Antti Oulasvirta,et al.  Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data , 2013, 2013 IEEE International Conference on Computer Vision.

[11]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[12]  Sebastian Thrun,et al.  Real-Time Human Pose Tracking from Range Data , 2012, ECCV.

[13]  Daniel Thalmann,et al.  Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Qing Zhang,et al.  A Survey on Human Motion Analysis from Depth Data , 2013, Time-of-Flight and Depth Imaging.

[15]  Vincent Lepetit,et al.  Hands Deep in Deep Learning for Hand Pose Estimation , 2015, ArXiv.

[16]  Yi Yang,et al.  Depth-Based Hand Pose Estimation: Data, Methods, and Challenges , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[17]  Marc Pollefeys,et al.  Joint Camera Pose Estimation and 3D Human Pose Estimation in a Multi-camera Setup , 2014, ACCV.

[18]  Ken Perlin,et al.  Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks , 2014, ACM Trans. Graph..

[19]  Lale Akarun,et al.  Hand Pose Estimation and Hand Shape Classification Using Multi-layered Randomized Decision Forests , 2012, ECCV.

[20]  Jian Sun,et al.  Cascaded hand pose regression , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Jovan Popović,et al.  Real-time hand-tracking with a color glove , 2009, SIGGRAPH 2009.

[22]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[23]  Tae-Kyun Kim,et al.  Opening the Black Box: Hierarchical Sampling Optimization for Estimating Human Hand Pose , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Karthik Ramani,et al.  A Collaborative Filtering Approach to Real-Time Hand Pose Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  Yi Yang,et al.  Depth-Based Hand Pose Estimation: Methods, Data, and Challenges , 2015, International Journal of Computer Vision.

[26]  Tae-Kyun Kim,et al.  Latent Regression Forest: Structured Estimation of 3D Articulated Hand Posture , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Andrew W. Fitzgibbon,et al.  Efficient and precise interactive hand tracking through joint, continuous optimization of pose and correspondences , 2016, ACM Trans. Graph..

[28]  Hans-Peter Seidel,et al.  Real-Time Hand Tracking Using a Sum of Anisotropic Gaussians Model , 2014, 2014 2nd International Conference on 3D Vision.

[29]  Antti Oulasvirta,et al.  Fast and robust hand tracking using detection-guided optimization , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Andrew W. Fitzgibbon,et al.  Accurate, Robust, and Flexible Real-time Hand Tracking , 2015, CHI.

[31]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[32]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[33]  Tae-Kyun Kim,et al.  Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests , 2013, 2013 IEEE International Conference on Computer Vision.

[34]  Chen Qian,et al.  Realtime and Robust Hand Tracking from Depth , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[36]  Antonis A. Argyros,et al.  Markerless and Efficient 26-DOF Hand Pose Recovery , 2010, ACCV.

[37]  Marc Levoy,et al.  Efficient variants of the ICP algorithm , 2001, Proceedings Third International Conference on 3-D Digital Imaging and Modeling.

[38]  Haibin Ling,et al.  3D Hand Pose Estimation Using Randomized Decision Forest with Segmentation Index Points , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[39]  Antonis A. Argyros,et al.  Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties , 2015, BMVC.

[40]  Vincent Lepetit,et al.  Efficiently Creating 3D Training Data for Fine Hand Pose Estimation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Li Cheng,et al.  Efficient Hand Pose Estimation from a Single Depth Image , 2013, 2013 IEEE International Conference on Computer Vision.