Paper information - Efficiently Learning a Distance Metric for Large Margin Nearest Neighbor Classification


We address the problem of learning a Mahalanobis distance metric to improve nearest neighbor classification. Our work builds on the large margin nearest neighbor (LMNN) classification framework. Because of the positive semidefiniteness constraint in its optimization problem, LMNN does not scale well with the dimensionality of the input data. The original LMNN solver partially alleviates this problem by using alternating projection methods instead of standard interior-point methods. Still, the computational complexity of each iteration is at least O(D^3), where D is the dimension of the input data. In this work, we propose a column generation based algorithm that solves the LMNN optimization problem much more efficiently. Our algorithm is much more scalable because it does not need a full eigendecomposition at each iteration. Instead, it only needs to find the leading eigenvalue and its corresponding eigenvector, which has O(D^2) complexity. Experiments show the efficiency and efficacy of our algorithm.
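The two ingredients named in the abstract, a Mahalanobis distance and a leading eigenpair computed without a full eigendecomposition, can be illustrated with a small dependency-free sketch. This is not the paper's solver: the abstract does not say which eigen-solver is used, so shifted power iteration is assumed here as a stand-in, and the helper names (`matvec`, `dot`, `mahalanobis_sq`, `leading_eigenpair`) are hypothetical. Each iteration is dominated by matrix-vector products, which cost O(D^2) for a dense D x D matrix.

```python
import math
import random


def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector: O(D^2) work."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    return dot(d, matvec(M, d))


def leading_eigenpair(A, num_iters=1000, tol=1e-12, seed=0):
    """Approximate the largest (algebraic) eigenvalue of a symmetric matrix A
    and a unit eigenvector for it, using shifted power iteration.

    Assumption: power iteration is an illustrative choice, not the paper's.
    """
    # The Frobenius norm bounds the spectral radius, so A + shift*I is PSD and
    # the iteration converges to the largest eigenvalue rather than the one
    # that is largest in absolute value.
    shift = math.sqrt(sum(a * a for row in A for a in row))
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in A]
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    lam = dot(v, matvec(A, v))  # Rayleigh quotient of the start vector
    for _ in range(num_iters):
        Av = matvec(A, v)
        w = [a + shift * x for a, x in zip(Av, v)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:  # A is the zero matrix; any unit vector works
            break
        v = [x / norm for x in w]
        lam_new = dot(v, matvec(A, v))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    return lam, v
```

For the symmetric matrix `[[0.0, 2.0], [2.0, -3.0]]`, whose eigenvalues are 1 and -4, `leading_eigenpair` should return a value close to 1, whereas unshifted power iteration would be drawn toward -4. In practice a Lanczos-type routine from a numerical library would likely be preferred for large D.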
