Large-scale distance metric learning for k-nearest neighbors regression

This paper presents a distance metric learning method for k-nearest neighbors regression. To learn the distance metric, we define constraints over triplets that are built from the neighborhood of each training instance. The resulting optimization problem can be formulated as a convex quadratic program. However, general-purpose quadratic programming solvers do not scale well to large problems. To reduce the training time, we propose a novel dual coordinate descent method for this type of problem. Experimental results on several regression data sets show that our method achieves competitive performance compared with state-of-the-art distance metric learning methods, while being an order of magnitude faster.
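The abstract does not spell out how the triplets are formed, so the following is only a minimal sketch of one plausible construction, not the paper's method. It assumes that, within the k nearest neighbors of instance i, a triplet (i, j, l) is generated whenever the target of j is closer to the target of i than the target of l is, and that a triplet is satisfied when the learned squared distance to l exceeds the distance to j by a margin. The function names, the margin value, and the use of Euclidean distance to find neighbors are all hypothetical choices made for illustration.

```python
import numpy as np


def build_triplets(X, y, k=3):
    """Return triplets (i, j, l) drawn from the k nearest neighbors of each x_i.

    Assumption (not from the paper): j is the "closer" neighbor when
    |y_i - y_j| < |y_i - y_l|, so the metric should place x_j nearer than x_l.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances in the original feature space.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    triplets = []
    for i in range(n):
        neighbors = np.argsort(d2[i], kind="stable")[:k]
        for j in neighbors:
            for l in neighbors:
                if abs(y[i] - y[j]) < abs(y[i] - y[l]):
                    triplets.append((i, int(j), int(l)))
    return triplets


def mahalanobis_sq(M, a, b):
    """Squared distance (a - b)^T M (a - b) for a positive semidefinite M."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(d @ M @ d)


def count_violations(M, X, triplets, margin=1.0):
    """Count triplets where d_M(i, l) - d_M(i, j) falls below the margin."""
    X = np.asarray(X, dtype=float)
    violated = 0
    for i, j, l in triplets:
        gap = mahalanobis_sq(M, X[i], X[l]) - mahalanobis_sq(M, X[i], X[j])
        if gap < margin:
            violated += 1
    return violated


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [10.0]]
    y = [0.0, 1.0, 5.0, 6.0]
    triplets = build_triplets(X, y, k=2)
    print(triplets)
    print(count_violations(np.eye(1), X, triplets))
```

Restricting triplets to each instance's neighborhood keeps the number of constraints at most on the order of n·k² rather than n³, which is what makes a per-constraint (dual) coordinate update attractive for large data sets.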
