Max-pooling convolutional neural networks for vision-based hand gesture recognition

Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a state-of-the-art big and deep neural network (NN) combining convolution and max-pooling (MPCNN) for supervised feature learning and classification of hand gestures given to mobile robots by humans wearing colored gloves. The hand contour is retrieved by color segmentation, then smoothed by morphological image processing, which eliminates noisy edges. Our big and deep MPCNN classifies 6 gesture classes with 96% accuracy, nearly three times better than the nearest competitor. Experiments with mobile robots using a 533 MHz ARM11 processor achieve real-time gesture recognition performance.
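
The preprocessing and pooling steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the RGB thresholds, the 3x3 structuring element, the 2x2 pooling window, and all function names (`segment_by_color`, `binary_open`, `max_pool`) are assumptions chosen for illustration. Morphological opening (erosion followed by dilation) stands in here for the unspecified "morphological image processing".

```python
import numpy as np

def segment_by_color(rgb, lower, upper):
    """Boolean mask of pixels whose RGB values lie in [lower, upper] (assumed thresholds)."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    return np.all((rgb >= lower) & (rgb <= upper), axis=-1)

def _neighbourhoods(mask):
    """Stack the nine 3x3-neighbourhood shifts of a 2-D mask (border padded with False)."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is set."""
    return _neighbourhoods(mask).all(axis=0)

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return _neighbourhoods(mask).any(axis=0)

def binary_open(mask):
    """Opening = erosion then dilation; removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))

def max_pool(x, k=2):
    """Non-overlapping k x k max-pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape
    h, w = h - h % k, w - w % k
    blocks = x[:h, :w].reshape(h // k, k, w // k, k)
    return blocks.max(axis=(1, 3))

# Toy 8x8 image: a 4x4 "glove" patch plus one isolated noise pixel.
image = np.zeros((8, 8, 3), dtype=np.uint8)
image[2:6, 2:6] = (200, 30, 30)
image[0, 7] = (200, 30, 30)

mask = segment_by_color(image, (150, 0, 0), (255, 80, 80))
clean = binary_open(mask)                      # noise pixel removed, patch preserved
pooled = max_pool(clean.astype(np.float32))    # 8x8 -> 4x4
```

In this toy example the opening discards the isolated pixel while leaving the 4x4 patch intact, and the pooling step halves each spatial dimension. In an MPCNN, pooling of this kind is applied to convolutional feature maps rather than directly to the segmentation mask.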
