A geometric theory of outliers and perturbation

We develop a new understanding of outliers and of the behavior of linear programs under perturbation.

Outliers are ubiquitous in scientific theory and practice. We analyze a simple algorithm for removing outliers from a high-dimensional data set and show that the algorithm is asymptotically good. We extend this result to distributions that we can access only by sampling, and also to the optimization version of the problem. Our results cover both the discrete and continuous cases. This is joint work with Santosh Vempala.

The complexity of solving linear programs has interested researchers for half a century. We show that an arbitrary linear program subject to a small random relative perturbation has a good condition number with high probability, and hence is easy to solve. This is joint work with Avrim Blum, Daniel Spielman, and Shang-Hua Teng. The result forms part of the smoothed analysis project, initiated by Spielman and Teng to give a better mathematical explanation of the observed performance of algorithms.

(Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
