Shared Feature Extraction for Nearest Neighbor Face Recognition

In this paper, we propose a new supervised linear feature extraction technique for multiclass classification problems that is especially suited to the nearest neighbor (NN) classifier. The problem of finding the optimal linear projection matrix is posed as a classification problem, and the AdaBoost algorithm is used to compute the matrix iteratively. This strategy allows a multitask learning (MTL) criterion to be introduced into the method, and yields a solution that makes no assumptions about the data distribution and is especially appropriate for the small sample size problem. The performance of the method is illustrated by an application to face recognition. The experiments show that the representation obtained with the multitask approach improves on classic feature extraction algorithms when the NN classifier is used, especially when only a few examples from each class are available.
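
As a rough illustration of the setting described above, the following sketch applies a linear projection matrix to the data and then classifies with a 1-NN rule in the projected space. It is not the boosting procedure proposed in the paper: the projection here is hand-picked for a toy problem, and the names `nn_classify`, `identity`, and `keep_first` are hypothetical, introduced only for this example.

```python
import numpy as np

def nn_classify(train_x, train_y, test_x, projection):
    """Label each test row with the class of its nearest training row,
    measured after applying the linear projection (1-NN rule)."""
    # Linear feature extraction: map D-dimensional inputs to d dimensions.
    train_z = train_x @ projection.T
    test_z = test_x @ projection.T
    # Squared Euclidean distance between every test/training pair.
    diffs = test_z[:, None, :] - train_z[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return train_y[np.argmin(dists, axis=1)]

# Toy data (assumed for illustration): feature 0 separates the two classes,
# feature 1 is large-scale noise that dominates raw Euclidean distances.
train_x = np.array([[0.0, 10.0], [0.2, -10.0], [1.0, 5.0], [1.2, -5.0]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.1, -6.0], [1.1, 9.0]])
test_y = np.array([0, 1])

identity = np.eye(2)                 # no feature extraction
keep_first = np.array([[1.0, 0.0]])  # hand-picked 1-D projection

pred_raw = nn_classify(train_x, train_y, test_x, identity)
pred_proj = nn_classify(train_x, train_y, test_x, keep_first)

print("accuracy without projection:", np.mean(pred_raw == test_y))
print("accuracy with projection:   ", np.mean(pred_proj == test_y))
```

In this toy case the NN rule misclassifies both test points in the original space and recovers both once the noisy feature is projected out, which is the kind of effect a learned projection is meant to achieve when training examples are scarce.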
