Sublinear Optimization for Machine Learning

We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard-margin SVM, and $L_2$-SVM, for which sublinear-time algorithms were not previously known. The new algorithms combine novel sampling techniques with a new multiplicative update algorithm. We give lower bounds showing that the running times of many of our algorithms are nearly best possible in the unit-cost RAM model. We also give implementations of our algorithms in the semi-streaming setting, obtaining the first low-pass, polylogarithmic-space, sublinear-time algorithms that achieve an arbitrary approximation factor.
