Self-improving algorithms

We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an arbitrary, unknown input distribution. We give such self-improving algorithms for sorting and clustering. The highlights of this work are (i) a sorting algorithm with optimal expected limiting running time and (ii) a k-median algorithm over the Hamming cube with linear expected limiting running time. In all cases, the algorithm begins with a learning phase, during which it adjusts itself to the input distribution (typically in a logarithmic number of rounds), followed by a stationary regime in which it settles into its optimized incarnation.
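
The two-phase structure described above (a learning phase followed by a stationary regime) can be illustrated with a toy sorter. This is only a sketch under stated assumptions and is not the paper's construction: the class name `SelfImprovingSorter`, the parameter `learning_rounds`, and the use of plain binary search over learned bucket boundaries are all hypothetical choices made for illustration, and the sketch makes no claim to the optimal expected running time.

```python
import bisect
import random


class SelfImprovingSorter:
    """Toy two-phase sorter (hypothetical; not the paper's algorithm).

    Learning phase: sort generically while recording the values seen.
    Stationary regime: place each value into a learned bucket, then sort
    the (hopefully small) buckets and concatenate them.
    """

    def __init__(self, learning_rounds):
        self.learning_rounds = learning_rounds
        self.samples = []        # values observed while learning
        self.boundaries = None   # bucket boundaries, fixed after learning
        self.rounds_seen = 0

    def sort(self, xs):
        if self.boundaries is None:
            # Learning phase: use a generic sort and collect statistics.
            self.samples.extend(xs)
            self.rounds_seen += 1
            if self.rounds_seen >= self.learning_rounds:
                self._freeze(len(xs))
            return sorted(xs)

        # Stationary regime: bucket by the learned boundaries.
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for x in xs:
            buckets[bisect.bisect_right(self.boundaries, x)].append(x)
        out = []
        for bucket in buckets:
            bucket.sort()        # buckets are ordered, so concatenation is sorted
            out.extend(bucket)
        return out

    def _freeze(self, n):
        # Keep roughly n evenly spaced sample values as bucket boundaries.
        pool = sorted(self.samples)
        step = max(1, len(pool) // max(1, n))
        self.boundaries = pool[step::step]
        self.samples = []


if __name__ == "__main__":
    random.seed(0)
    sorter = SelfImprovingSorter(learning_rounds=3)
    for _ in range(6):
        data = [random.gauss(0, 1) for _ in range(20)]
        assert sorter.sort(data) == sorted(data)
```

The design point the sketch tries to convey is that work done during learning (here, choosing bucket boundaries from past inputs) is reused on every later input drawn from the same distribution, so the per-input cost in the stationary regime depends on how well the learned structure fits that distribution.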
