A decoupled approach to exemplar-based unsupervised learning

A recent trend in exemplar-based unsupervised learning is to formulate the learning problem as a convex optimization problem. Convexity is achieved by restricting the set of possible prototypes to the training exemplars. In particular, this has been done for clustering, vector quantization, and mixture model density estimation. In this paper, we propose a novel algorithm that is theoretically and practically superior to these convex formulations. This is made possible by posing the unsupervised learning problem as a single convex "master problem" with non-convex subproblems. We show that, for the learning tasks above, the subproblems are extremely well-behaved and can be solved efficiently.
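To make the restricted-prototype idea concrete, the following is a minimal sketch of the kind of convex formulation the abstract refers to, not of the algorithm proposed in this paper. It assumes a mixture whose components are fixed at the training exemplars, so that only the mixing weights are optimized and the mean log-likelihood is concave in those weights. The function names, the Gaussian-style similarity `exp(-beta * squared distance)`, and the parameters `beta` and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np

def exemplar_mixture_weights(X, beta=1.0, n_iter=200):
    """Fit mixing weights q over exemplar-centred components (illustrative sketch).

    Components are fixed at the training points, so only q is optimized.
    The update below is the standard EM step for mixture weights with
    fixed components; it keeps q on the probability simplex.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all training points.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # S[i, j]: unnormalized similarity of point i to exemplar j (assumed form).
    S = np.exp(-beta * sq_dist)
    q = np.full(n, 1.0 / n)  # start from uniform weights
    for _ in range(n_iter):
        mix = S @ q  # mix[i] = sum_j q[j] * S[i, j]
        # New q[j] is the average responsibility of exemplar j over all points.
        q = q * (S / mix[:, None]).mean(axis=0)
    return q, S

def log_likelihood(q, S):
    """Mean log-likelihood (up to a constant) of the data under weights q."""
    return float(np.mean(np.log(S @ q)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated synthetic blobs, used only to exercise the sketch.
    X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(3.0, 0.1, (10, 2))])
    q, S = exemplar_mixture_weights(X)
    print("weights sum to", q.sum())
    print("mean log-likelihood:", log_likelihood(q, S))
```

Because the candidate prototypes are limited to the data points themselves, the only free variables are the weights, which is what makes this restricted problem convex. The abstract's proposal differs in that it keeps a convex master problem while handling prototype selection through non-convex subproblems.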
