Hand gesture recognition using view projection from point cloud

In this paper we propose a multi-view method for recognizing hand gestures from point clouds. The main idea is to project the point cloud into view images and to describe hand gestures by extracting and fusing features from those view images. This conversion of the feature space increases intra-class similarity while reducing inter-class similarity. Because the features of the view images are extracted in parallel, each feature extractor can be kept small, which makes it easier to converge. In our method, we first perform a refined hand segmentation to separate the hand from the background. The segmented hand point cloud is then projected onto different view planes to form view images. Next, we use convolutional neural networks as feature extractors to extract features from the view images. The extracted view image features are fused to form the features of the hand gesture. Finally, an SVM is trained for hand gesture recognition. The experimental results show that our multi-view method achieves a higher recognition rate and is more robust to challenging rotation changes, especially out-of-plane rotations.
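The projection step can be illustrated with a minimal sketch. The abstract does not state which view planes, image resolution, or pixel encoding are used, so the three axis-aligned planes, the `size` parameter, the depth-style pixel values, and the function name `project_to_views` below are all assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def project_to_views(points, size=32):
    """Project an (N, 3) point cloud onto the xy, xz and yz planes.

    Hypothetical helper: returns a dict of three (size, size) images.
    Each occupied pixel stores a value in [0.5, 1.0] derived from the
    coordinate along the dropped axis; empty pixels stay at 0.
    """
    points = np.asarray(points, dtype=np.float64)

    # Normalize into the unit cube with one shared scale so the hand's
    # aspect ratio is preserved in every view.
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    scale = extent if extent > 0 else 1.0
    unit = (points - mins) / scale  # values in [0, 1]

    # Integer pixel coordinates, clamped to the last row/column.
    pix = np.minimum((unit * size).astype(int), size - 1)

    # (horizontal axis, vertical axis, dropped axis) for each assumed view.
    planes = {"xy": (0, 1, 2), "xz": (0, 2, 1), "yz": (1, 2, 0)}
    views = {}
    for name, (u, v, d) in planes.items():
        img = np.zeros((size, size))
        # Keep the largest value per pixel when several points collide.
        np.maximum.at(img, (pix[:, v], pix[:, u]), unit[:, d] * 0.5 + 0.5)
        views[name] = img
    return views

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(500, 3))  # stand-in for a segmented hand
    for name, img in project_to_views(cloud).items():
        print(name, img.shape, int((img > 0).sum()), "occupied pixels")
```

Each resulting image could then be passed to its own small feature extractor, with the per-view feature vectors fused (for example by concatenation, which is again an assumption) before training the classifier.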
