Image Classification from Small Sample, with Distance Learning and Feature Selection

The small-sample problem is acute in many application domains, and it may be partially addressed by feature selection or dimensionality reduction. For the purpose of distance learning, we describe a method for feature selection that uses equivalence constraints between pairs of data points and is based on L1-regularized optimization. Feature selection is then incorporated into an existing nonparametric method for distance learning, which is based on the boosting of constrained generative models. The final algorithm therefore employs dynamic feature selection, where features are selected anew in each boosting iteration based on the weighted training data. We tested our algorithm on the classification of facial images from two public-domain databases. In extensive experiments, our method performed substantially better than a number of competing methods, including the original boosting-based distance learning method and two commonly used Mahalanobis metrics.
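To make the dynamic selection step concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes equivalence constraints given as paired rows with same/different labels, represents each constraint by its per-feature squared differences, and stands in scikit-learn's L1-penalized logistic regression for the paper's L1-regularized optimization; all function names and the boosting skeleton are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def select_features(pairs_a, pairs_b, labels, weights, C=0.1):
    """Pick a sparse feature subset from weighted equivalence constraints.

    pairs_a, pairs_b : (n_pairs, n_features) arrays of paired points
    labels           : +1 for "same class" pairs, -1 for "different class"
    weights          : per-constraint boosting weights
    """
    # Per-feature squared differences: small on positive pairs and large
    # on negative pairs when a feature is relevant to class identity.
    diffs = (pairs_a - pairs_b) ** 2
    # L1-penalized classifier separating same from different pairs; the
    # L1 term drives most coefficients to exactly zero (Lasso-style).
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    clf.fit(diffs, labels, sample_weight=weights)
    return np.flatnonzero(clf.coef_.ravel())  # indices of surviving features


def boost_feature_subsets(pairs_a, pairs_b, labels, n_rounds=10):
    """Skeleton of the boosting loop: features are re-selected every round."""
    w = np.full(len(labels), 1.0 / len(labels))
    subsets = []
    for _ in range(n_rounds):
        feats = select_features(pairs_a, pairs_b, labels, w)
        subsets.append(feats)
        # A full implementation would now fit a weak (generative) distance
        # learner on the selected columns, score each constraint, and update
        # w multiplicatively as in standard boosting; omitted here.
    return subsets
```

With uniform weights this reduces to a single Lasso-style selection; it is the evolving boosting weights that make the selection dynamic from one iteration to the next.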
