Hand gesture recognition using Bag-of-features and multi-class Support Vector Machine

This paper discusses the use of Scale-Invariant Feature Transform (SIFT) features for bare-hand gesture recognition. In the training stage, the SIFT keypoints of the training images cannot be fed directly to a multi-class Support Vector Machine (SVM) to build a classifier model, because the keypoints extracted from each training image, which contains only the hand gesture, do not share a common vector space: each image yields its own set of keypoints, whereas the SVM expects input vectors of one fixed dimension. The bag-of-features model is therefore introduced. After the keypoints of every training image are extracted with the SIFT algorithm, a vector quantization technique unifies them: following K-means clustering, the keypoints of each training image are mapped to a histogram vector of unified dimension (a bag-of-words vector). This histogram is the input vector from which the multi-class SVM builds the trained classifier model. In the testing stage, keypoints are extracted from every image captured from the webcam and fed into the cluster model, which maps them to a single bag-of-words vector; this vector is finally passed to the trained multi-class SVM classifier to recognize the hand gesture.
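The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses toy 2-D descriptors in place of 128-dimensional SIFT descriptors, a plain Lloyd-style K-means written in pure Python, and L1-normalized histograms, all of which are assumptions made for illustration. The function names (`kmeans`, `bag_of_words`, `nearest_center`) are hypothetical. The point it shows is that images with different numbers of keypoints end up as histograms of the same length, which is what a multi-class SVM would then take as input.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(descriptor, centers):
    """Index of the cluster center (visual word) closest to a descriptor."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(descriptor, centers[i]))


def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd-style K-means; returns k cluster centers (the codebook)."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest_center(d, centers)].append(d)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def bag_of_words(descriptors, centers):
    """Map any number of descriptors to one histogram of length len(centers).

    The histogram is L1-normalized (an assumption of this sketch) so that
    images with different keypoint counts remain comparable.
    """
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy stand-ins for per-image descriptor sets of different sizes.
image_a = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]                          # 3 keypoints
image_b = [[5.1, 4.9], [4.9, 5.0], [0.0, 0.0], [5.0, 5.0], [0.1, 0.1]]  # 5 keypoints

# Training stage: build the codebook from all training descriptors.
codebook = kmeans(image_a + image_b, k=2)

# Both images now map to vectors of the same dimension (k = 2).
hist_a = bag_of_words(image_a, codebook)
hist_b = bag_of_words(image_b, codebook)
```

In a full pipeline under the same assumptions, `hist_a` and `hist_b` would be paired with gesture labels and passed to a multi-class SVM trainer, and each test image would be converted with the same `codebook` before classification.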
