A Collaborative Filtering Approach to Real-Time Hand Pose Estimation

Collaborative filtering aims to predict unknown user ratings in a recommender system by collectively assessing known user preferences. In this paper, we first draw analogies between collaborative filtering and the pose estimation problem. Specifically, we recast the hand pose estimation problem as the cold-start problem for a new user with unknown item ratings in a recommender system. Inspired by fast and accurate matrix factorization techniques for collaborative filtering, we develop a real-time algorithm for estimating the hand pose from RGB-D data of a commercial depth camera. First, we efficiently identify nearest neighbors using local shape descriptors in the RGB-D domain from a library of hand poses with known pose parameter values. We then use this information to evaluate the unknown pose parameters using a joint matrix factorization and completion (JMFC) approach. Our quantitative and qualitative results suggest that our approach is robust to variation in hand configurations while achieving real time performance (≈ 29 FPS) on a standard computer.

[1]  Thomas M Greiner,et al.  Hand Anthropometry of U.S. Army Personnel , 1991 .

[2]  Ronen Basri,et al.  Direct visibility of point sets , 2007, ACM Trans. Graph..

[3]  Tae-Kyun Kim,et al.  Latent Regression Forest: Structured Estimation of 3D Articulated Hand Posture , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Tae-Kyun Kim,et al.  Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests , 2013, 2013 IEEE International Conference on Computer Vision.

[5]  Jae Kyeong Kim,et al.  A literature review and classification of recommender systems research , 2012, Expert Syst. Appl..

[6]  Tamara G. Kolda,et al.  All-at-once Optimization for Coupled Matrix and Tensor Factorizations , 2011, ArXiv.

[7]  Ying Wu,et al.  Modeling the constraints of human hand motion , 2000, Proceedings Workshop on Human Motion.

[8]  Tosiyasu L. Kunii,et al.  Model-based analysis of hand posture , 1995, IEEE Computer Graphics and Applications.

[9]  Ken Perlin,et al.  Real-Time Continuous Pose Recovery of Human Hands Using Convolutional Networks , 2014, ACM Trans. Graph..

[10]  Keiichi Abe,et al.  Topological structural analysis of digitized binary images by border following , 1985, Comput. Vis. Graph. Image Process..

[11]  Lale Akarun,et al.  Hand Pose Estimation and Hand Shape Classification Using Multi-layered Randomized Decision Forests , 2012, ECCV.

[12]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[13]  Jian Sun,et al.  Cascaded hand pose regression , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yehuda Koren,et al.  Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights , 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[15]  Sylvain Paris,et al.  6D hands: markerless hand-tracking for computer aided design , 2011, UIST.

[16]  John R. Anderson,et al.  Human memory: An adaptive perspective. , 1989 .

[17]  C. D. Meyer,et al.  Initializations for the Nonnegative Matrix Factorization , 2006 .

[18]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[19]  Sean M. McNee,et al.  Getting to know you: learning new user preferences in recommender systems , 2002, IUI '02.

[20]  Antonis A. Argyros,et al.  Efficient model-based 3D tracking of hand articulations using Kinect , 2011, BMVC.

[21]  Lale Akarun,et al.  Real time hand pose estimation using depth sensors , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[22]  Tom Drummond,et al.  Fusing points and lines for high performance tracking , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[23]  Jure Leskovec,et al.  Hidden factors and hidden topics: understanding rating dimensions with review text , 2013, RecSys.

[24]  Li Cheng,et al.  Efficient Hand Pose Estimation from a Single Depth Image , 2013, 2013 IEEE International Conference on Computer Vision.

[25]  Hans-Peter Kriegel,et al.  OPTICS: ordering points to identify the clustering structure , 1999, SIGMOD '99.

[26]  Thomas S. Huang,et al.  A fast two-dimensional median filtering algorithm , 1979 .

[27]  Mircea Nicolescu,et al.  Vision-based hand pose estimation: A review , 2007, Comput. Vis. Image Underst..

[28]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[29]  Sterling Orsten,et al.  Dynamics based 3D skeletal hand tracking , 2013, I3D '13.

[30]  John Riedl,et al.  Learning preferences of new users in recommender systems: an information theoretic approach , 2008, SKDD.

[31]  John Riedl,et al.  An Algorithmic Framework for Performing Collaborative Filtering , 1999, SIGIR Forum.

[32]  Andrew W. Fitzgibbon,et al.  Accurate, Robust, and Flexible Real-time Hand Tracking , 2015, CHI.

[33]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[34]  W. Cleveland,et al.  Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting , 1988 .

[35]  Björn Stenger,et al.  Multivariate Relevance Vector Machines for Tracking , 2006, ECCV.

[36]  J. H. Choi,et al.  DFacTo: Distributed Factorization of Tensors , 2014, NIPS.