Subgradient and sampling algorithms for l1 regression

Given an <i>n</i> × <i>d</i> matrix <i>A</i> and an <i>n</i>-vector <i>b</i>, the <i>l</i><inf>1</inf> <i>regression</i> problem is to find the vector <i>x</i> minimizing the objective function ||<i>Ax</i> - <i>b</i>||<inf>1</inf>, where ||<i>y</i>||<inf>1</inf> ≡ Σ<inf>i</inf>|<i>y</i><inf>i</inf>| for a vector <i>y</i>. This paper gives an algorithm needing <i>O</i>(<i>n</i> log <i>n</i>)<i>d</i><i><sup>O</sup></i><sup>(1)</sup> time in the worst case to obtain an approximate solution, with objective function value within a fixed ratio of optimum. Given ε > 0, a solution whose value is within 1 + ε of optimum can be obtained either by a deterministic algorithm using an additional <i>O</i>(<i>n</i>)(<i>d</i>/ε)<i><sup>O</sup></i><sup>(1)</sup> time, or by a Monte Carlo algorithm using an additional <i>O</i>((<i>d</i>/ε)<i><sup>O</sup></i><sup>(1)</sup>) time. The analysis of the randomized algorithm shows that weighted coresets exist for <i>l</i><inf>1</inf> regression. The algorithms use the ellipsoid method, gradient descent, and random sampling.
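
To make the objective concrete, the following is a minimal sketch of a plain subgradient method for ||<i>Ax</i> - <i>b</i>||<inf>1</inf>. It is not the algorithm of the paper, which combines the ellipsoid method, gradient descent, and random sampling to reach the stated running times; it only illustrates the objective and one of its subgradients, <i>A</i><sup>T</sup> sign(<i>Ax</i> - <i>b</i>). The function names, the starting point <i>x</i> = 0, the diminishing step rule, and the iteration count are all assumptions chosen for illustration, and the sketch carries no approximation guarantee.

```python
def l1_objective(A, b, x):
    """Return ||Ax - b||_1, with A given as a list of rows."""
    return sum(
        abs(sum(a * xj for a, xj in zip(row, x)) - bi)
        for row, bi in zip(A, b)
    )


def sign(t):
    """Return -1, 0, or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def l1_subgradient(A, b, x):
    """Return one subgradient of x -> ||Ax - b||_1, namely A^T sign(Ax - b)."""
    g = [0.0] * len(x)
    for row, bi in zip(A, b):
        s = sign(sum(a * xj for a, xj in zip(row, x)) - bi)
        for j, a in enumerate(row):
            g[j] += s * a
    return g


def subgradient_descent(A, b, steps=2000, step0=1.0):
    """Illustrative subgradient method (assumed step rule: step0 / sqrt(k)).

    Starts at x = 0 and returns the best iterate seen together with its
    objective value, since subgradient steps need not decrease the objective.
    """
    x = [0.0] * len(A[0])
    best_x, best_val = x[:], l1_objective(A, b, x)
    for k in range(1, steps + 1):
        g = l1_subgradient(A, b, x)
        norm = sum(gj * gj for gj in g) ** 0.5
        if norm == 0.0:
            break  # zero is a subgradient here, so x is a minimizer
        t = step0 / (norm * k ** 0.5)  # step of length step0 / sqrt(k)
        x = [xj - t * gj for xj, gj in zip(x, g)]
        val = l1_objective(A, b, x)
        if val < best_val:
            best_x, best_val = x[:], val
    return best_x, best_val
```

As a sanity check, when <i>A</i> is a single column of ones the minimizer is a median of the entries of <i>b</i>: for <i>b</i> = (0, 1, 10) the optimum is <i>x</i> = 1 with objective value 10, and `subgradient_descent([[1.0], [1.0], [1.0]], [0.0, 1.0, 10.0])` can be compared against that value.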
