A hand gesture recognition system based on canonical superpixel-graph

Abstract This paper presents a new hand gesture recognition system based on a novel canonical superpixel-graph earth mover’s distance (CSG-EMD) metric. It aims to improve on the superpixel earth mover’s distance (SP-EMD), a recently proposed distance metric designed for depth-based hand gesture recognition. In practice, people have individual habits when performing a given hand gesture, which yields a variety of hand shapes with different finger poses. Such variety can reduce the accuracy of SP-EMD and hence degrade recognition performance. In this paper, we propose a new distance metric, CSG-EMD, to alleviate the problem. Scattered superpixels are organized in the form of a canonical superpixel-graph, which factors out non-standard finger poses, resulting in a well-structured, finger-pose-neutral shape representation for hand gestures. Moreover, a structure-stress-based fusion scheme is applied to formulate the proposed distance metric, i.e. CSG-EMD, for gesture recognition. Experimental results on five public gesture datasets show that the proposed CSG-EMD-based system achieves better recognition accuracy than the other state-of-the-art algorithms it is compared with. Its superiority is further demonstrated by two real-life applications.
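To make the underlying notion of an earth mover's distance between two sets of superpixels concrete, the following is a minimal sketch under simplifying assumptions, not the CSG-EMD or SP-EMD formulation itself. It assumes each hand shape is summarized by the same number of superpixel centers with equal weights, in which case the EMD reduces to the cheapest one-to-one matching between the two sets. The function name `uniform_emd` and the sample coordinates are hypothetical, and the canonical superpixel-graph, depth information, and structure-stress fusion are all omitted.

```python
from itertools import permutations
from math import dist


def uniform_emd(points_a, points_b):
    """Earth mover's distance between two equally sized point sets.

    Assumption (not from the paper): every point carries the same weight
    1/n, so the optimal transport plan is a one-to-one matching and can be
    found by trying every permutation. This is only practical for small n.
    """
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("both sets must be non-empty and of equal size")
    n = len(points_a)
    # Total ground distance of the cheapest one-to-one matching.
    best_total = min(
        sum(dist(points_a[i], points_b[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    # Each point moves a mass of 1/n, so the cost is the mean distance.
    return best_total / n


# Hypothetical superpixel centers for two hand shapes.
shape_a = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
shape_b = [(0.0, 1.0), (2.0, 1.0), (1.0, 4.0)]
distance = uniform_emd(shape_a, shape_b)
```

With unequal weights or set sizes, the matching shortcut no longer applies and the general transportation problem has to be solved instead, for example with a linear-programming solver.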
