Algorithms for Data Depth

Data depth is a tool for multivariate data analysis: it measures how central a data point is with respect to a data set. Many different notions of data depth have been introduced. In this thesis we explore efficient algorithms for computing several of these notions, including exact and approximation algorithms for Tukey (halfspace) depth [57, 93], Oja depth [75], and majority depth [61, 90].
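As a concrete illustration of one of these notions, the Tukey depth of a query point q with respect to a planar point set is the smallest number of points contained in any closed halfplane whose boundary passes through q. The following is a minimal brute-force sketch of that definition in Python, running in O(n^2) time. It is not one of the algorithms developed in the thesis, and the function name tukey_depth_2d is a hypothetical choice made for this example. It assumes exactly representable coordinates (for instance integers) so that the sign tests are exact.

```python
def tukey_depth_2d(q, points):
    """Brute-force Tukey (halfspace) depth of q in a planar point set.

    Illustrative O(n^2) sketch only; the name and interface are assumptions.
    """
    qx, qy = q
    offsets = [(x - qx, y - qy) for x, y in points]
    # Points equal to q lie in every closed halfplane through q.
    coincident = sum(1 for d in offsets if d == (0, 0))
    others = [d for d in offsets if d != (0, 0)]
    if not others:
        return coincident

    best = len(others)
    # The halfplane count only changes when the boundary line sweeps over a
    # point, so it suffices to look at lines through q and each point p,
    # rotated infinitesimally in either direction.
    for dx, dy in others:
        left = right = front = back = 0
        for ex, ey in others:
            cross = dx * ey - dy * ex
            if cross > 0:
                left += 1
            elif cross < 0:
                right += 1
            elif dx * ex + dy * ey > 0:
                front += 1  # on the line, same side of q as p
            else:
                back += 1   # on the line, opposite side of q
        # A slightly rotated closed halfplane holds one open side of the
        # line plus the points on one of the two rays of the line.
        best = min(best, min(left, right) + min(front, back))
    return coincident + best


if __name__ == "__main__":
    square = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    print(tukey_depth_2d((0, 0), square))  # center of the square
    print(tukey_depth_2d((1, 1), square))  # a corner of the square
    print(tukey_depth_2d((5, 0), square))  # a point outside the square
```

Points deep inside the data receive larger values, and points outside the convex hull receive depth 0, which matches the intuition of depth as a measure of centrality.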

[1]  Sergey Bereg, et al.  Competitive Algorithms for Maintaining a Mobile Center, 2006, Mob. Networks Appl.

[2]  Nimrod Megiddo, et al.  Linear Programming in Linear Time When the Dimension Is Fixed, 1984, JACM.

[3]  Bernard Chazelle, et al.  Quasi-optimal range searching in spaces of finite VC-dimension, 1989, Discret. Comput. Geom.

[4]  C. C. Gonzaga, et al.  An Algorithm for Solving Linear Programming Problems in O(n^3 L) Operations, 1989.

[5]  James Renegar, et al.  A polynomial-time algorithm, based on Newton's method, for linear programming, 1988, Math. Program.

[6]  Ketan Mulmuley.  Dehn-Sommerville relations, upper bound theorem, and levels in arrangements, 1993, SCG '93.

[7]  John W. Chinneck, et al.  Feasibility and Infeasibility in Optimization: Algorithms and Computational Methods, 2007.

[8]  David Bremner, et al.  Primal-dual algorithms for data depth, 2003, Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications.

[9]  Imre Bárány, et al.  A generalization of Carathéodory's theorem, 1982, Discret. Math.

[10]  H. Chernoff.  A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations, 1952.

[11]  Greg Aloupis, et al.  Geometric Measures of Data Depth, 2022.

[12]  Regina Y. Liu.  On a Notion of Data Depth Based on Random Simplices, 1990.

[13]  Franco P. Preparata, et al.  The Densest Hemisphere Problem, 1978, Theor. Comput. Sci.

[14]  Pat Morin, et al.  Approximating Majority Depth, 2012, CCCG.

[15]  Jiří Matoušek, et al.  Discrepancy and approximations for bounded VC-dimension, 1993, Comb.

[16]  Yuval Rabani, et al.  Linear Programming, 2007, Handbook of Approximation Algorithms and Metaheuristics.

[17]  Géza Tóth, et al.  Point Sets with Many k-Sets, 2000, SCG '00.

[18]  Timothy M. Chan.  Remarks on k-Level Algorithms in the Plane, 1999.

[19]  Jayaram K. Sankaran.  A note on resolving infeasibility in linear programs by constraint relaxation, 1993, Oper. Res. Lett.

[20]  J. L. Hodges, et al.  A Bivariate Sign Test, 1955.

[21]  C. H. Edwards, et al.  Calculus with analytic geometry, 1994.

[22]  Herbert Edelsbrunner, et al.  Constructing Belts in Two-Dimensional Arrangements with Applications, 1986, SIAM J. Comput.

[23]  N. Megiddo.  Linear-time algorithms for linear programming in R^3 and related problems, 1982, FOCS 1982.

[24]  Timothy M. Chan.  Optimal Partition Trees, 2012, Discret. Comput. Geom.

[25]  Micha Sharir, et al.  Arrangements and Their Applications, 2000, Handbook of Computational Geometry.

[26]  Heinrich W. Guggenheimer, et al.  Applicable geometry: global and local convexity, 1977.

[27]  Leonidas J. Guibas, et al.  Topologically sweeping an arrangement, 1986, STOC '86.

[28]  Stefan Langerman, et al.  Optimization in Arrangements, 2003, STACS.

[29]  Gilbert W. Bassett.  Equivariant, Monotonic, 50% Breakdown Estimators, 1991.

[30]  Diane L. Souvaine, et al.  Computational Geometry and Statistical Depth Measures, 2004.

[31]  H. Oja, et al.  The finite-sample breakdown point of the Oja bivariate median and of the corresponding half-samples version, 1990.

[32]  Timothy M. Chan, et al.  Counting inversions, offline orthogonal range counting, and related problems, 2010, SODA '10.

[33]  Nimrod Megiddo, et al.  On Finding Primal- and Dual-Optimal Bases, 1991, INFORMS J. Comput.

[34]  János Pach, et al.  Combinatorial Geometry, 2012.

[35]  Peter Rousseeuw, et al.  Computing location depth and regression depth in higher dimensions, 1998, Stat. Comput.

[36]  R. Serfling, et al.  General notions of statistical depth function, 2000.

[37]  Yinyu Ye, et al.  Identifying an optimal basis in linear programming, 1996, Ann. Oper. Res.

[38]  Regina Y. Liu, et al.  Regression depth. Commentaries. Rejoinder, 1999.

[39]  Regina Y. Liu, et al.  A Quality Index Based on Data Depth and Multivariate Rank Tests, 1993.

[40]  K. Mosler.  Multivariate Dispersion, Central Regions, and Depth: The Lift Zonoid Approach, 2002.

[41]  V. Barnett.  The Ordering of Multivariate Data, 1976.

[42]  Raimund Seidel, et al.  Small-dimensional linear programming and convex hulls made easy, 1991, Discret. Comput. Geom.

[43]  Pat Morin, et al.  Oja centers and centers of gravity, 2013, Comput. Geom.

[44]  Pat Morin, et al.  Output-sensitive algorithms for Tukey depth and related problems, 2008, Stat. Comput.

[45]  G.S. Brodal, et al.  Dynamic planar convex hull, 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings.

[46]  N. Chakravarti.  Some results concerning post-infeasibility analysis, 1994.

[47]  Pravin M. Vaidya, et al.  An algorithm for linear programming which requires O(((m+n)n^2 + (m+n)^{1.5} n)L) arithmetic operations, 1990, Math. Program.

[48]  Kenneth L. Clarkson, et al.  Las Vegas algorithms for linear and integer programming when the dimension is small, 1995, JACM.

[49]  Edoardo Amaldi, et al.  The Complexity and Approximability of Finding Maximum Feasible Subsystems of Linear Relations, 1995, Theor. Comput. Sci.

[50]  Martin E. Dyer, et al.  Linear Time Algorithms for Two- and Three-Variable Linear Programs, 1984, SIAM J. Comput.

[51]  Timothy M. Chan, et al.  On Approximate Range Counting and Depth, 2007, SCG '07.

[52]  H. Oja.  Descriptive Statistics for Multivariate Distributions, 1983.

[53]  David K. Smith.  Theory of Linear and Integer Programming, 1987.

[54]  Peter A. Beling, et al.  Using Fast Matrix Multiplication to Find Basic Solutions, 1998, Theoretical Computer Science.

[55]  Emo Welzl, et al.  On Spanning Trees with Low Crossing Numbers, 1992, Data Structures and Efficient Algorithms.

[56]  J. Tukey.  Mathematics and the Picturing of Data, 1975.

[57]  C. C. Gonzaga, et al.  An algorithm for solving linear programming programs in O(n^3 L) operations, 1988.

[58]  Michael R. Fellows, et al.  Parameterized Complexity, 1998.

[59]  Alicia Nieto-Reyes, et al.  The random Tukey depth, 2007, Comput. Stat. Data Anal.

[60]  Frederick S. Hillier, et al.  Introduction to Operations Research, 1967.

[61]  P. Cortez, et al.  A data mining approach to predict forest fires using meteorological data, 2007.

[62]  Erik D. Demaine, et al.  Tight bounds for dynamic convex hull queries (again), 2007, SCG '07.

[63]  Joseph O'Rourke, et al.  Handbook of Discrete and Computational Geometry, Second Edition, 1997.

[64]  P. Orponen, et al.  Computation of the multivariate Oja median, 2003.

[65]  R. K. Shyamasundar, et al.  Introduction to algorithms, 1996.

[66]  Gert Vegter, et al.  In handbook of discrete and computational geometry, 1997.

[67]  P. Rousseeuw, et al.  Breakdown Points of Affine Equivariant Estimators of Multivariate Location and Covariance Matrices, 1991.

[68]  J. Matoušek, et al.  On geometric optimization with few violated constraints, 1994, SCG '94.

[69]  Mark H. Overmars, et al.  On a Class of O(n^2) Problems in Computational Geometry, 1995, Comput. Geom.

[70]  Rand R. Wilcox, et al.  Approximating Tukey's Depth, 2003.

[71]  Regina Y. Liu, et al.  Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh), 1999.

[72]  Kenneth L. Clarkson.  A bound on local minima of arrangements that implies the upper bound theorem, 1993, Discret. Comput. Geom.

[73]  Philip D. Plowright, et al.  Convexity, 2019, Optimization for Chemical and Biochemical Engineering.

[74]  Jean-Philippe Vial, et al.  A polynomial method of approximate centers for linear programming, 1992, Math. Program.

[75]  Rajeev Raman, et al.  Sorting in linear time?, 1995, STOC '95.

[76]  Bhaswar B. Bhattacharya, et al.  The Projection Median of a Set of Points in ℝ^d, 2012, CCCG.

[77]  Timothy M. Chan.  An optimal randomized algorithm for maximum Tukey depth, 2004, SODA '04.

[78]  Herbert Edelsbrunner, et al.  Algorithms in Combinatorial Geometry, 1987, EATCS Monographs in Theoretical Computer Science.

[79]  D. T. Lee, et al.  Finding the diameter of a set of lines, 1985, Pattern Recognit.

[80]  D. Donoho, et al.  Breakdown Properties of Location Estimates Based on Halfspace Depth and Projected Outlyingness, 1992.

[81]  Godfried T. Toussaint, et al.  Algorithms for bivariate medians and a Fermat-Torricelli problem for lines, 2001, CCCG.

[82]  Pat Morin, et al.  Absolute approximation of Tukey depth: Theory and experiments, 2013, Comput. Geom.

[83]  Pat Morin, et al.  Algorithms for Bivariate Majority Depth, 2011, CCCG.

[84]  Timothy M. Chan.  Low-Dimensional Linear Programming with Violations, 2005, SIAM J. Comput.

[85]  Tamal K. Dey, et al.  Improved Bounds for Planar k-Sets and Related Problems, 1998, Discret. Comput. Geom.

[86]  J. Matoušek, et al.  Computing the center of planar point sets, 1991.

[87]  Micha Sharir, et al.  A Combinatorial Bound for Linear Programming and Related Problems, 1992, STACS.

[88]  Dan Chen, et al.  A Branch and Cut Algorithm for the Halfspace Depth Problem, 2007, ArXiv.

[89]  Jiří Matoušek, et al.  Lectures on discrete geometry, 2002, Graduate texts in mathematics.

[90]  Paul F. Dietz.  Optimal Algorithms for List Indexing and Subset Rank, 1989, WADS.

[91]  K. Nordhausen, et al.  OjaNP: Multivariate Methods Based on the Oja Median and Related Concepts, 2010.

[92]  M. Shamos.  Geometry and statistics: problems at the interface, 1976.