Bagging-like metric learning for support vector regression

Metrics play an important role in machine learning and pattern recognition. Although many off-the-shelf metrics can be selected for a learning task at hand, such as k-nearest neighbor classification or k-means clustering, such a choice is not always appropriate because it is independent of the data. It has been shown that a task-dependent metric learned from the given data can yield better learning performance. Inspired by this success, we focus on learning an embedded metric specifically for support vector regression and present a corresponding learning algorithm, termed SVRML, which minimizes the error on the validation dataset while enforcing sparsity on the learned metric matrix. Taking the learned metric (a positive semi-definite matrix) as a base learner, we further develop an effective bagging-like ensemble metric learning framework in which the resampling mechanism of the original bagging is modified specifically for SVRML. Experiments on various datasets demonstrate that our method outperforms both single and bagging-based ensemble metric learning for support vector regression.
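
To make the two ingredients above concrete, the following is a minimal sketch, not the SVRML algorithm itself. It assumes the learned metric is a positive semi-definite matrix M that enters a Gaussian kernel through the distance (x - z)^T M (x - z), and that an ensemble can be formed by averaging base metrics, which keeps the result positive semi-definite. The function names, the random base metrics, and the averaging rule are illustrative assumptions; the SVRML optimization and the modified resampling mechanism are not specified in this abstract and are not reproduced here.

```python
import numpy as np


def mahalanobis_sq(X, Z, M):
    """Squared distances (x - z)^T M (x - z) for every row pair of X and Z."""
    diff = X[:, None, :] - Z[None, :, :]          # shape (n, m, d)
    return np.einsum("ijk,kl,ijl->ij", diff, M, diff)


def metric_rbf_kernel(X, Z, M):
    """Gaussian kernel with the metric M embedded in the distance (assumed form)."""
    return np.exp(-mahalanobis_sq(X, Z, M))


def average_metrics(metrics):
    """Combine base metrics by simple averaging (an illustrative choice)."""
    return np.mean(metrics, axis=0)


def is_psd(M, tol=1e-10):
    """Check positive semi-definiteness via the eigenvalues of the symmetric part."""
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) >= -tol))


rng = np.random.default_rng(0)
d = 3

# Hypothetical base metrics: random PSD matrices standing in for metrics
# that would each be learned on a different resample of the training data.
base_metrics = [L.T @ L for L in rng.normal(size=(5, d, d))]

M_bar = average_metrics(base_metrics)             # ensemble metric
X = rng.normal(size=(4, d))
K = metric_rbf_kernel(X, X, M_bar)                # kernel matrix usable by an SVR solver
```

A kernel matrix such as `K` could be passed to any support vector regression solver that accepts precomputed kernels. Averaging is used here only because the set of positive semi-definite matrices is convex, so the combined matrix remains a valid metric.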
