Fast solvers and efficient implementations for distance metric learning

In this paper we study how to improve nearest neighbor classification by learning a Mahalanobis distance metric. We build on a recently proposed framework for distance metric learning known as large margin nearest neighbor (LMNN) classification. Our paper makes three contributions. First, we describe a highly efficient solver for the particular instance of semidefinite programming that arises in LMNN classification; our solver can handle problems with billions of large margin constraints in a few hours. Second, we show how to reduce both training and testing times using metric ball trees; the speedups from ball trees are further magnified by learning low dimensional representations of the input space. Third, we show how to learn different Mahalanobis distance metrics in different parts of the input space. For large data sets, the use of locally adaptive distance metrics leads to even lower error rates.

[1]  Yann LeCun,et al.  Efficient Pattern Recognition Using a New Transformation Distance , 1992, NIPS.

[2]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[4]  B. Borchers CSDP, A C library for semidefinite programming , 1999 .

[5]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[6]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[7]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[8]  Andrew W. Moore,et al.  An Investigation of Practical Approximate Nearest Neighbor Algorithms , 2004, NIPS.

[9]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[10]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[11]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[12]  Lorenzo Torresani,et al.  Large Margin Component Analysis , 2006, NIPS.

[13]  Jitendra Malik,et al.  Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[14]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.