Geometric median in nearly linear time

In this paper we provide faster algorithms for solving the geometric median problem: given $n$ points in $\mathbb{R}^d$, compute a point that minimizes the sum of Euclidean distances to the points. This is one of the oldest non-trivial problems in computational geometry, yet despite a long history of research the previous fastest running times for computing a $(1+\epsilon)$-approximate geometric median were $O(d \cdot n^{4/3} \epsilon^{-8/3})$ by Chin et al., $\tilde{O}(d \exp(\epsilon^{-4} \log \epsilon^{-1}))$ by Badoiu et al., $O(nd + \mathrm{poly}(d, \epsilon^{-1}))$ by Feldman and Langberg, and the polynomial running time of $O((nd)^{O(1)} \log \frac{1}{\epsilon})$ by Parrilo and Sturmfels and by Xue and Ye. In this paper we show how to compute such an approximate geometric median in time $O(nd \log^3 \frac{n}{\epsilon})$ and $O(d \epsilon^{-2})$. While our $O(d \epsilon^{-2})$ time algorithm is a fairly straightforward application of stochastic subgradient descent, our $O(nd \log^3 \frac{n}{\epsilon})$ time algorithm is a novel long step interior point method. We start with a simple $O((nd)^{O(1)} \log \frac{1}{\epsilon})$ time interior point method and show how to improve it, ultimately building an algorithm that is quite non-standard from the perspective of the interior point literature. Our result is one of the few cases of outperforming standard interior point theory. Furthermore, it is the only case we know of where interior point methods yield a nearly linear time algorithm for a canonical optimization problem that traditionally requires superlinear time.
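To make the stochastic subgradient idea concrete, the following is a minimal sketch, not the algorithm analysed in the paper. It minimizes $f(x) = \sum_i \|x - a_i\|_2$ by sampling one input point per step and moving along the unit vector $(x - a_i)/\|x - a_i\|_2$, which is a subgradient of the sampled term. The function names, the step-size rule `radius / sqrt(t)`, the starting point, and the default iteration count are all illustrative assumptions.

```python
import math
import random


def objective(x, points):
    """Sum of Euclidean distances from x to every input point."""
    return sum(math.dist(x, p) for p in points)


def sgd_geometric_median(points, steps=20000, seed=0):
    """Averaged stochastic subgradient descent on f(x) = sum_i ||x - a_i||.

    Hypothetical helper: step sizes and iteration count are assumptions
    chosen for illustration, not the parameters from the paper.
    """
    rng = random.Random(seed)
    d = len(points[0])
    # Start from the coordinate-wise mean of the input points.
    x = [sum(p[j] for p in points) / len(points) for j in range(d)]
    # Length scale for the step size; fall back to 1.0 if all points coincide.
    radius = max(math.dist(x, p) for p in points) or 1.0
    avg = list(x)
    for t in range(1, steps + 1):
        a = rng.choice(points)  # sample one term of the sum uniformly
        dist = math.dist(x, a)
        if dist > 0.0:
            # (x - a) / ||x - a|| is a unit-norm subgradient of ||x - a||.
            eta = radius / math.sqrt(t)
            x = [xj - eta * (xj - aj) / dist for xj, aj in zip(x, a)]
        # Running average of the iterates x_1, ..., x_t.
        avg = [m + (xj - m) / t for m, xj in zip(avg, x)]
    return avg


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    est = sgd_geometric_median(pts)
    print(est, objective(est, pts))
```

Each step touches a single point and costs $O(d)$ work, so the total cost of this kind of method depends on the number of steps rather than on $n$. The sketch returns the averaged iterate because individual iterates of subgradient methods oscillate around the minimizer.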

[1] J. Jewkes, et al. Theory of Location of Industries, 1933.

[2] Harold W. Kuhn, et al. A note on Fermat's problem, 1973, Math. Program.

[3] Lawrence M. Ostresh. On the Convergence of a Class of Iterative Methods for Solving the Weber Location Problem, 1978, Oper. Res.

[4] Richard A. Kronmal, et al. The alias and alias-rejection-mixture methods for generating random variables from probability distributions, 1979, WSC '79.

[5] L. Cooper, et al. The Weber problem revisited, 1981.

[6] E. Balas, et al. A Note on the Weiszfeld-Kuhn Algorithm for the General Fermat Problem, 1982.

[7] Chandrajit L. Bajaj, et al. The algebraic degree of geometric optimization problems, 1988, Discret. Comput. Geom.

[8] James Renegar, et al. A polynomial-time algorithm, based on Newton's method, for linear programming, 1988, Math. Program.

[9] Arie Tamir, et al. Open questions concerning Weiszfeld's algorithm for the Fermat-Weber location problem, 1989, Math. Program.

[10] M. Shirosaki. Another proof of the defect relation for moving targets, 1991.

[11] P. Rousseeuw, et al. Breakdown Points of Affine Equivariant Estimators of Multivariate Location and Covariance Matrices, 1991.

[12] Clóvis C. Gonzaga, et al. Path-Following Methods for Linear Programming, 1992, SIAM Rev.

[13] Yurii Nesterov, et al. Interior-point polynomial algorithms in convex programming, 1994, SIAM Studies in Applied Mathematics.

[14] Jakob Krarup, et al. On Torricelli's geometrical solution to a problem of Fermat, 1997.

[15] Yinyu Ye, et al. An Efficient Algorithm for Minimizing a Sum of Euclidean Norms with Applications, 1997, SIAM J. Optim.

[16] Yinyu Ye, et al. Interior point algorithms: theory and analysis, 1997.

[17] R. Motwani, et al. High-Dimensional Computational Geometry, 2000.

[18] Cun-Hui Zhang, et al. The multivariate L1-median and associated data depth, 2000, Proceedings of the National Academy of Sciences of the United States of America.

[19] Pablo A. Parrilo, et al. Minimizing Polynomial Functions, 2001, Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science.

[20] Piotr Indyk, et al. Approximate clustering via core-sets, 2002, STOC '02.

[21] Zvi Drezner, et al. The Weber Problem, 2002.

[22] P. Bose, et al. Fast approximations for sums of distances, clustering and the Fermat-Weber problem, 2003, Comput. Geom.

[23] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.

[24] Sariel Har-Peled, et al. Smaller Coresets for k-Median and k-Means Clustering, 2005, SCG.

[25] F. Plastria, et al. On the convergence of the Weiszfeld algorithm for continuous single facility location–allocation problems, 2008.

[26] Michael Langberg, et al. A unified framework for approximating and clustering data, 2011, STOC '11.

[27] Gary L. Miller, et al. Runtime guarantees for regression problems, 2011, ITCS '13.

[28] Aleksander Madry, et al. Navigating Central Path with Electrical Flows: From Flows to Matchings, and Back, 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[29] Sébastien Bubeck, et al. Theory of Convex Optimization for Machine Learning, 2014, ArXiv.

[30] Yin Tat Lee, et al. Path Finding Methods for Linear Programming: Solving Linear Programs in Õ(√rank) Iterations and Faster Algorithms for Maximum Flow, 2014, 2014 IEEE 55th Annual Symposium on Foundations of Computer Science.

[31] Sébastien Bubeck, et al. Convex Optimization: Algorithms and Complexity, 2014, Found. Trends Mach. Learn.

[32] Alexandr Andoni, et al. High-Dimensional Computational Geometry, 2016, Handbook of Big Data.