Discriminative Metric Learning by Neighborhood Gerrymandering

We formulate the problem of metric learning for k-nearest-neighbor classification as a large-margin structured prediction problem, with a latent variable representing the choice of neighbors and the task loss corresponding directly to classification error. We describe an efficient algorithm for exact loss-augmented inference and a fast gradient descent algorithm for learning in this model. The objective drives the metric to establish neighborhood boundaries that favor the true class labels of the training points. Our approach, reminiscent of gerrymandering (the redrawing of political boundaries to advantage certain parties), optimizes classification accuracy more directly than previously proposed methods. In experiments on a variety of datasets, our method achieves excellent results compared to the current state of the art in metric learning.
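To make the loss-augmented inference step concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary labels, a full linear metric L (so the Mahalanobis matrix is L^T L), and a k-set score equal to minus the sum of squared distances; the exactness comes from enumerating the number c of same-class neighbors in the set, since for a fixed c the best set is always the c nearest same-class plus the k−c nearest other-class points.

```python
import numpy as np

def sq_dists(x, X, L):
    # Squared Mahalanobis distances from x to each row of X under M = L^T L.
    Z = (X - x) @ L.T
    return np.einsum('ij,ij->i', Z, Z)

def best_set(d, same_mask, k, c):
    # Highest-scoring k-set containing exactly c same-class points:
    # the c nearest same-class and the k-c nearest other-class points.
    same_idx = np.flatnonzero(same_mask)
    diff_idx = np.flatnonzero(~same_mask)
    if c > len(same_idx) or k - c > len(diff_idx):
        return None  # infeasible composition
    pick_s = same_idx[np.argsort(d[same_idx])[:c]]
    pick_d = diff_idx[np.argsort(d[diff_idx])[:k - c]]
    return np.concatenate([pick_s, pick_d])

def loss_augmented_inference(x, y, X, Y, L, k=3):
    # argmax over k-sets h of score(h) + loss(h), where
    # score(h) = -sum of distances and loss(h) = 1 iff the
    # majority vote over h mislabels x (binary labels assumed).
    d = sq_dists(x, X, L)
    same = (Y == y)
    best, best_val = None, -np.inf
    for c in range(k + 1):
        h = best_set(d, same, k, c)
        if h is None:
            continue
        loss = float(c <= k // 2)  # majority wrong when same-class count <= k//2
        val = -d[h].sum() + loss
        if val > best_val:
            best, best_val = h, val
    return best
```

The enumeration over c costs O(k) sorted-neighbor lookups per query, which is what makes exact inference cheap despite the exponential number of candidate k-sets.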
