The Dual Codebook: Combining Bags of Visual Words in Image Classification

In this paper, we evaluate two conventional bag of visual words approaches, each using one basic local feature descriptor, for image classification. We compare them with a novel design, the dual codebook, which combines two bags of visual words built from two different feature descriptors. The system extends earlier work in which a bag of visual words approach with an L2 support vector machine classifier outperformed several alternatives. The descriptors we test are raw pixel intensities and the Histogram of Oriented Gradients. Using a novel Primal Support Vector Machine as the classifier, we perform image classification on the CIFAR-10 and MNIST datasets. Results show that the dual codebook exploits the complementary information captured by the second feature descriptor, improving classification performance by 5-18% on CIFAR-10 and by 0.22-1.03% on MNIST compared with the single-descriptor bag of words approaches.
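
The idea of combining two bags of visual words can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the patch size, stride, codebook size, hard-assignment encoding, plain Lloyd k-means, the simplified single-cell orientation histogram standing in for the Histogram of Oriented Gradients, and all function names are assumptions made for illustration, and the classifier is omitted.

```python
import numpy as np

def extract_patches(image, size=8, stride=4):
    """Return size x size patches of a 2-D grayscale image on a regular grid."""
    h, w = image.shape
    return np.array([image[r:r + size, c:c + size]
                     for r in range(0, h - size + 1, stride)
                     for c in range(0, w - size + 1, stride)])

def pixel_descriptor(patches):
    """Descriptor 1: raw pixel intensities, one flat vector per patch."""
    return patches.reshape(len(patches), -1).astype(float)

def orientation_descriptor(patches, bins=9):
    """Descriptor 2 (assumed simplification of HOG): one unsigned
    gradient-orientation histogram per patch, L2-normalised."""
    gy, gx = np.gradient(patches.astype(float), axis=(1, 2))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)  # orientation in [0, pi)
    idx = np.minimum((angle / np.pi * bins).astype(int), bins - 1)
    hist = np.zeros((len(patches), bins))
    for b in range(bins):
        hist[:, b] = (magnitude * (idx == b)).sum(axis=(1, 2))
    return hist / (np.linalg.norm(hist, axis=1, keepdims=True) + 1e-8)

def assign(data, centroids):
    """Index of the nearest centroid (visual word) for each descriptor."""
    dist = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd k-means; the returned centroids form one codebook."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, centroids)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def encode(descriptors, centroids):
    """Bag of visual words: normalised histogram of nearest-word counts."""
    counts = np.bincount(assign(descriptors, centroids),
                         minlength=len(centroids))
    return counts / counts.sum()

def dual_codebook_features(image, pixel_codebook, orientation_codebook):
    """Concatenate the two bags of visual words into one feature vector."""
    patches = extract_patches(image)
    return np.concatenate([
        encode(pixel_descriptor(patches), pixel_codebook),
        encode(orientation_descriptor(patches), orientation_codebook),
    ])

# Random 32x32 arrays stand in for real images (placeholder data only).
rng = np.random.default_rng(0)
images = rng.random((20, 32, 32))
all_patches = np.concatenate([extract_patches(im) for im in images])
pixel_codebook = kmeans(pixel_descriptor(all_patches), k=16)
orientation_codebook = kmeans(orientation_descriptor(all_patches), k=16)
features = np.array([
    dual_codebook_features(im, pixel_codebook, orientation_codebook)
    for im in images
])
print(features.shape)
```

Each row of `features` holds one image's two histograms side by side, so a classifier trained on it sees both descriptors at once. In practice the rows would be passed to a linear classifier such as a support vector machine; the Primal Support Vector Machine described in the paper is not reproduced here.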
