Non-linear neighborhood component analysis based on constructive neural networks

In this paper, we propose a novel non-linear supervised metric learning algorithm that combines neighborhood component analysis (NCA) with constructive neural networks, which gradually increase the network size during training. The network is trained to maximize a stochastic variant of the leave-one-out K-nearest-neighbor (KNN) classification accuracy on the training set. In this way, the proposed algorithm learns a non-linear metric for KNN classification, overcoming the limitation of traditional metric learning algorithms, which can only learn linear transformations; it is therefore more flexible and powerful in transforming data than its linear counterparts. Moreover, it can also learn a low-dimensional non-linear mapping for visualization and fast classification. We validate the method on several benchmark datasets for both metric learning and dimensionality reduction, and the results demonstrate the competitiveness of the proposed approach.
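The objective described above, the stochastic leave-one-out KNN score, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes the standard NCA soft-neighbor rule with Gaussian weights, applied to points `z` already mapped through some non-linear embedding; the function name and signature are illustrative.

```python
import numpy as np

def stochastic_loo_knn_score(z, y):
    """Stochastic leave-one-out KNN score (the NCA objective).

    z : (n, d) array of embedded points, i.e. the non-linear mapping
        applied to the training inputs (assumed computed elsewhere)
    y : (n,) array of class labels

    Each point i stochastically picks a neighbor j with probability
    proportional to exp(-||z_i - z_j||^2); the score is the expected
    number of points whose picked neighbor shares their label.
    """
    n = z.shape[0]
    # pairwise squared Euclidean distances between embedded points
    sq = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)                  # leave-one-out: never pick self
    logits = -sq
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # p[i, j] = P(i picks j)
    same_class = (y[:, None] == y[None, :])       # label-agreement indicator
    return float((p * same_class).sum())          # expected correct picks
```

Maximizing this quantity with respect to the parameters of the embedding (here, the weights of the constructive network producing `z`) is what drives the metric learning; the score is differentiable, so gradient-based training applies directly.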
