Image Classification by Selective Regularized Subspace Learning

Feature learning is an intensively studied research topic in image classification. Although existing methods like sparse coding, locality-constrained linear coding, fisher vector encoding, etc., have shown their effectiveness in image representation, most of them overlook a phenomenon called thesmall sample size problem, where the number of training samples is relatively smaller than the dimensionality of the features, which may limit the predictive power of the classifier. Subspace learning is a strategy to mitigate this problem by reducing the dimensionality of the features. However, most conventional subspace learning methods attempt to learn a global subspace to discriminate all the classes, which proves to be difficult and ineffective in multi-class classification task. To this end, we propose to learn a local subspace for each sample instead of learning a global subspace for all samples. Our key observation is that, in multi-class image classification, the label of each testing sample is only confused by a few classes which have very similar visual appearance to it. Thus, in this work, we propose a coarse-to-fine strategy, which first picks out such classes, and then conducts a local subspace learning to discriminate them. As the subspace learning method is regularized and conducted within some selected classes, we term it selective regularized subspace learning (SRSL), and we term our classification pipeline selective regularized subspace learning based multi-class image classification (SRSL_MIC). Experimental results on four representative datasets (Caltech-101, Indoor-67, ORL Faces and AR Faces) demonstrate the effectiveness of the proposed method.

[1]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[2]  Fereshteh Sadeghi,et al.  Latent Pyramidal Regions for Recognizing Scenes , 2012, ECCV.

[3]  Stephen Lin,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Jing Huang,et al.  An automatic hierarchical image classification scheme , 1998, MULTIMEDIA '98.

[5]  Pascal Frossard,et al.  Semantic Coding by Supervised Dimensionality Reduction , 2008, IEEE Transactions on Multimedia.

[6]  Avinash C. Kak,et al.  PCA versus LDA , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Peter N. Belhumeur,et al.  POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Thomas Mensink,et al.  Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.

[10]  Svetlana Lazebnik,et al.  Scene recognition and weakly supervised object localization with deformable part-based models , 2011, 2011 International Conference on Computer Vision.

[11]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[12]  Andrew Zisserman,et al.  On-the-fly learning for visual search of large-scale image and video datasets , 2015, International Journal of Multimedia Information Retrieval.

[13]  Thomas S. Huang Locally Linear Embedded Eigenspace Analysis , 2005 .

[14]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[15]  Jieping Ye,et al.  An optimization criterion for generalized discriminant analysis on undersampled problems , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[17]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[18]  Lei Wang,et al.  An Adaptive Approach to Learning Optimal Neighborhood Kernels , 2013, IEEE Transactions on Cybernetics.

[19]  Hao Su,et al.  Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[20]  Forrest N. Iandola,et al.  Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction , 2013, 2013 IEEE International Conference on Computer Vision.

[21]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[22]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Shuicheng Yan,et al.  Neighborhood preserving embedding , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[24]  Sukumar Bandopadhyay,et al.  An Objective Analysis of Support Vector Machine Based Classification for Remote Sensing , 2008 .

[25]  Adrian Popescu,et al.  Large-Scale Image Mining with Flickr Groups , 2015, MMM.

[26]  Andy Harter,et al.  Parameterisation of a stochastic model for human face identification , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[27]  Stefan Poslad,et al.  An Enhanced Bag-of-Visual Word Vector Space Model to Represent Visual Content in Athletics Images , 2012, IEEE Transactions on Multimedia.

[28]  Krishnakumar Balasubramanian,et al.  Smooth sparse coding via marginal regression for learning sparse representations , 2012, Artif. Intell..

[29]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[30]  Aleix M. Martinez,et al.  The AR face database , 1998 .

[31]  Stefan Carlsson,et al.  CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[32]  Pascal Vincent,et al.  K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms , 2001, NIPS.

[33]  Hamid R. Rabiee,et al.  From Local Similarity to Global Coding: An Application to Image Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[34]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[35]  Jitendra Malik,et al.  SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[36]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[37]  Chong-Wah Ngo,et al.  Representations of Keypoint-Based Semantic Concept Detection: A Comprehensive Study , 2010, IEEE Transactions on Multimedia.

[38]  Lei Zhang,et al.  Sparse representation or collaborative representation: Which helps face recognition? , 2011, 2011 International Conference on Computer Vision.

[39]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[40]  David A. Landgrebe,et al.  The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon , 1994, IEEE Trans. Geosci. Remote. Sens..

[41]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[42]  Alex A. Freitas,et al.  A survey of hierarchical classification across different application domains , 2010, Data Mining and Knowledge Discovery.

[43]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[44]  Alexei A. Efros,et al.  Unsupervised Discovery of Mid-Level Discriminative Patches , 2012, ECCV.

[45]  Adnan Yazici,et al.  Towards Effective Image Classification Using Class-Specific Codebooks and Distinctive Local Features , 2015, IEEE Transactions on Multimedia.

[46]  Yihong Gong,et al.  Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.

[47]  Xiaojun Wu,et al.  A complete fuzzy discriminant analysis approach for face recognition , 2010, Appl. Soft Comput..

[48]  Winston H. Hsu,et al.  Scalable Mobile Visual Classification by Kernel Preserving Projection Over High-Dimensional Features , 2014, IEEE Transactions on Multimedia.

[49]  Ian T. Jolliffe,et al.  Principal Component Analysis , 2002, International Encyclopedia of Statistical Science.

[50]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Alex Pentland,et al.  Face recognition using eigenfaces , 1991, Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[53]  Keinosuke Fukunaga,et al.  Introduction to Statistical Pattern Recognition , 1972 .

[54]  Hanqing Lu,et al.  Solving the small sample size problem of LDA , 2002, Object recognition supported by user interaction for service robots.

[55]  Prateek Jain,et al.  Fast image search for learned metrics , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[57]  Qi Tian,et al.  Discriminant Subspace Analysis: An Adaptive Approach for Image Classification , 2009, IEEE Transactions on Multimedia.

[58]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[59]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[60]  Ja-Chen Lin,et al.  A new LDA-based face recognition system which can solve the small sample size problem , 1998, Pattern Recognit..

[61]  Hwann-Tzong Chen,et al.  Local discriminant embedding and its variants , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[62]  Jianping Fan,et al.  Mining Multilevel Image Semantics via Hierarchical Classification , 2008, IEEE Transactions on Multimedia.

[63]  Dimitrios Gunopulos,et al.  Adaptive Nearest Neighbor Classification Using Support Vector Machines , 2001, NIPS.

[64]  Pietro Perona,et al.  A sparse object category model for efficient learning and exhaustive recognition , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[65]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[66]  Eli Shechtman,et al.  In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[67]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[68]  Hao Su,et al.  Objects as Attributes for Scene Classification , 2010, ECCV Workshops.

[69]  David Haussler,et al.  Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.

[70]  Dumitru Erhan,et al.  Training Deep Neural Networks on Noisy Labels with Bootstrapping , 2014, ICLR.

[71]  Ahmet Sertbas,et al.  Evaluation of face recognition techniques using PCA, wavelets and SVM , 2010, Expert Syst. Appl..

[72]  Jiawei Han,et al.  Semi-supervised Discriminant Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[73]  C. V. Jawahar,et al.  Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[74]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .