Learning Hierarchical Features Using Sparse Self-organizing Map Coding for Image Classification

The choice of image descriptor is critical for most image classification problems. Low-level image descriptors based on Gabor filters, SIFT, and HOG features have provided good image representations in many applications. However, these descriptors are not appropriate for image classification on their own and need to be converted into more suitable representations. This conversion can be performed by applying two operations: coding and pooling. In the coding operation, a codebook is learned from and adapted to the training data, while the pooling operation summarizes the coded features over larger regions. Several coding and pooling schemes have been proposed in the literature. In this paper, a self-organizing map (SOM) is employed to learn a topologically adapted codebook in place of the well-known k-means algorithm. In addition, a new non-negative sparse coding technique for the SOM is proposed and tested. The new sparse coding method uses Non-negative Least Squares (NNLS) optimization to best reconstruct every input pattern from its K best-matching codewords. Experimental results on the Caltech-101 database show the effectiveness of the proposed method compared with other state-of-the-art methods.
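The coding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sparse_nnls_code` and `max_pool`, the use of Euclidean distance to pick the K best-matching codewords, the choice of max pooling, and the random codebook standing in for a trained SOM are all assumptions made for illustration. Only `scipy.optimize.nnls` is an existing library routine.

```python
import numpy as np
from scipy.optimize import nnls


def sparse_nnls_code(x, codebook, k=5):
    """Encode descriptor x using its k best-matching codewords (hypothetical helper).

    codebook has shape (n_codewords, dim); in the paper it would come from a
    trained SOM, here it is simply any array of codeword vectors.
    """
    # Assumption: "best-matching" means smallest Euclidean distance to x.
    dists = np.sum((codebook - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]

    # Solve min ||A c - x||_2 subject to c >= 0, where the columns of A
    # are the k selected codewords.
    A = codebook[nearest].T
    coeffs, _residual = nnls(A, x)

    # Scatter the k coefficients into a full-length code; all other
    # entries stay zero, so the code is sparse and non-negative.
    code = np.zeros(codebook.shape[0])
    code[nearest] = coeffs
    return code


def max_pool(codes):
    """Summarize the codes of one region by an element-wise maximum (assumed pooling)."""
    return codes.max(axis=0)


# Toy data: a random stand-in codebook and ten random descriptors.
rng = np.random.default_rng(0)
codebook = rng.random((16, 8))
descriptors = rng.random((10, 8))

codes = np.array([sparse_nnls_code(x, codebook, k=3) for x in descriptors])
pooled = max_pool(codes)
```

Each row of `codes` has at most `k` nonzero entries, all non-negative, and `pooled` is a single vector with one entry per codeword that summarizes the region.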
