Occlusion Aware Hand Pose Recovery from Sequences of Depth Images

State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. Results on a synthetic, highly-occluded dataset demonstrate that the proposed method outperforms most recent pose recovering approaches, including those based on CNNs.

[1]  Carolina Cruz-Neira,et al.  Surround-Screen Projection-Based Virtual Reality: The Design and Implementation of the CAVE , 2023 .

[2]  Yi Yang,et al.  Depth-Based Hand Pose Estimation: Methods, Data, and Challenges , 2015, International Journal of Computer Vision.

[3]  Adrien Bartoli,et al.  Sequential Non-Rigid Structure-from-Motion with the 3D-Implicit Low-Rank Shape Model , 2010, ECCV.

[4]  Narendra Ahuja,et al.  Robust Orthonormal Subspace Learning: Efficient Recovery of Corrupted Low-Rank Matrices , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Jordi Gonzàlez,et al.  Human Pose Estimation from Monocular Images: A Comprehensive Survey , 2016, Sensors.

[6]  Jake K. Aggarwal,et al.  View invariant human action recognition using histograms of 3D joints , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[7]  Deva Ramanan,et al.  First-person pose recognition using egocentric workspaces , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Antti Oulasvirta,et al.  Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data , 2013, 2013 IEEE International Conference on Computer Vision.

[9]  Bogdan J. Matuszewski,et al.  3D Deformable Shape Reconstruction with Diffusion Maps , 2013, BMVC.

[10]  Zhengyou Zhang,et al.  Iterative point matching for registration of free-form curves and surfaces , 1994, International Journal of Computer Vision.

[11]  Vincent Lepetit,et al.  Hands Deep in Deep Learning for Hand Pose Estimation , 2015, ArXiv.

[12]  Sergio Escalera,et al.  A Survey on Model Based Approaches for 2D and 3D Visual Human Pose Recovery , 2014, Sensors.

[13]  Jian Sun,et al.  Cascaded hand pose regression , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jitendra Malik,et al.  Shape matching and object recognition using shape contexts , 2010, 2010 3rd International Conference on Computer Science and Information Technology.

[15]  Riccardo Poli,et al.  Particle swarm optimization , 1995, Swarm Intelligence.

[16]  Tae-Kyun Kim,et al.  Latent Regression Forest: Structured Estimation of 3D Articulated Hand Posture , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Karthik Ramani,et al.  A Collaborative Filtering Approach to Real-Time Hand Pose Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Daniel Thalmann,et al.  Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[20]  Ken Perlin,et al.  Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks , 2014, ACM Trans. Graph..

[21]  Andrew W. Fitzgibbon,et al.  Accurate, Robust, and Flexible Real-time Hand Tracking , 2015, CHI.

[22]  Lale Akarun,et al.  Hand Pose Estimation and Hand Shape Classification Using Multi-layered Randomized Decision Forests , 2012, ECCV.

[23]  Li Cheng,et al.  Efficient Hand Pose Estimation from a Single Depth Image , 2013, 2013 IEEE International Conference on Computer Vision.

[24]  Yaser Sheikh,et al.  Bilinear spatiotemporal basis models , 2012, TOGS.

[25]  Tae-Kyun Kim,et al.  Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests , 2013, 2013 IEEE International Conference on Computer Vision.

[26]  Chen Qian,et al.  Realtime and Robust Hand Tracking from Depth , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Mircea Nicolescu,et al.  Vision-based hand pose estimation: A review , 2007, Comput. Vis. Image Underst..

[28]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[29]  John C. Hart,et al.  The CAVE: audio visual experience automatic virtual environment , 1992, CACM.

[30]  Fernando De la Torre,et al.  Spatio-temporal Matching for Human Detection in Video , 2014, ECCV.

[31]  Haibo Li,et al.  Direct hand pose estimation for immersive gestural interaction , 2015, Pattern Recognit. Lett..