Large-Margin Metric Learning for Partitioning Problems

In this paper, we consider unsupervised partitioning problems, such as clustering, image segmentation, video segmentation and other change-point detection problems. We focus on partitioning problems based explicitly or implicitly on the minimization of Euclidean distortions, which include mean-based change-point detection, K-means, spectral clustering and normalized cuts. Our main goal is to learn a Mahalanobis metric for these unsupervised problems, leading to feature weighting and/or selection. This is done in a supervised way by assuming the availability of several datasets, possibly only partially labelled, that share the same metric. We cast the metric learning problem as a large-margin structured prediction problem with appropriately defined regularizers and losses, leading to a convex optimization problem that can be solved efficiently with iterative techniques. We provide experiments showing that learning the metric can significantly improve partitioning performance on synthetic examples and on bioinformatics, video segmentation and image segmentation problems.
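
To make the role of the metric concrete, the following is a minimal sketch, not the method of the paper, of the distortion that K-means-style partitioning minimizes once squared Euclidean distances are replaced by squared Mahalanobis distances. The function name `mahalanobis_distortion`, the toy data and the diagonal metric `M_select` are illustrative assumptions; the metric is fixed by hand here rather than learned.

```python
import numpy as np

def mahalanobis_distortion(X, labels, M):
    """Sum over clusters of squared Mahalanobis distances to the cluster mean.

    X: (n, d) data matrix, labels: (n,) cluster assignments,
    M: (d, d) symmetric positive semi-definite metric (hypothetical input).
    """
    total = 0.0
    for k in np.unique(labels):
        pts = X[labels == k]
        diff = pts - pts.mean(axis=0)
        # Adds diff_i^T M diff_i for every point i in cluster k.
        total += np.einsum("ij,jk,ik->", diff, M, diff)
    return float(total)

# Toy data: feature 0 has a small spread, feature 1 a large one.
X = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0], [1.0, 10.0]])
split_on_f0 = np.array([0, 0, 1, 1])  # partition along feature 0
split_on_f1 = np.array([0, 1, 0, 1])  # partition along feature 1

M_identity = np.eye(2)            # plain Euclidean distortion
M_select = np.diag([1.0, 0.0])    # metric that ignores feature 1

d_euc_f0 = mahalanobis_distortion(X, split_on_f0, M_identity)
d_euc_f1 = mahalanobis_distortion(X, split_on_f1, M_identity)
d_sel_f0 = mahalanobis_distortion(X, split_on_f0, M_select)
d_sel_f1 = mahalanobis_distortion(X, split_on_f1, M_select)
```

Under the identity metric the partition along feature 1 has the lower distortion, while under `M_select` the partition along feature 0 does. This is the sense in which a diagonal Mahalanobis metric acts as feature weighting or selection for the partitioning problem.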
