The Catline for Deep Regression

Motivated by the notion of regression depth (Rousseeuw and Hubert, 1996) we introduce the catline, a new method for simple linear regression. At any bivariate data set Z_n = {(x_i, y_i); i = 1, ..., n} its regression depth is at least n/3. This lower bound is attained for data lying on a convex or concave curve, whereas for perfectly linear data the catline attains a depth of n. We construct an O(n log n) algorithm for the catline, so it can be computed fast in practice. The catline is Fisher-consistent at any linear model y = βx + α + e in which the error distribution satisfies med(e | x) = 0, which encompasses skewed and/or heteroscedastic errors. The breakdown value of the catline is 1/3, and its influence function is bounded. At the bivariate Gaussian distribution its asymptotic relative efficiency compared to the L1 line is 79.3% for the slope, and 88.9% for the intercept. The finite-sample relative efficiencies are in close agreement with these values. This combination of properties makes the catline an attractive fitting method.
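
The abstract refers to regression depth without defining it. As a rough illustration only, the sketch below computes the depth of a given line under the definition commonly attributed to Rousseeuw and Hubert for simple regression (an assumption here, since this page does not state it): the smallest number of observations that must be removed before the line can be tilted to vertical without passing through any remaining point. The function name `regression_depth` and the tolerance `tol` are hypothetical choices for this example. It does not compute the catline itself.

```python
def regression_depth(xs, ys, slope, intercept, tol=1e-12):
    """Depth of the line y = slope * x + intercept relative to the data.

    Assumed definition: minimise, over every split of the x-axis into a
    left and a right part, the number of points that block tilting the
    line (non-negative residuals on one side plus non-positive residuals
    on the other side, in either orientation).
    """
    pts = sorted(zip(xs, ys))
    res = [y - (slope * x + intercept) for x, y in pts]
    n = len(pts)

    total_nonneg = sum(1 for r in res if r >= -tol)
    total_nonpos = sum(1 for r in res if r <= tol)

    # Split placed to the left of every point: all points are on the right.
    depth = min(total_nonneg, total_nonpos)

    left_nonneg = 0
    left_nonpos = 0
    i = 0
    while i < n:
        # Move every point sharing this x value to the left side together,
        # because a split can only fall between distinct x values.
        j = i
        while j < n and pts[j][0] == pts[i][0]:
            left_nonneg += int(res[j] >= -tol)
            left_nonpos += int(res[j] <= tol)
            j += 1
        right_nonneg = total_nonneg - left_nonneg
        right_nonpos = total_nonpos - left_nonpos
        depth = min(depth,
                    left_nonneg + right_nonpos,
                    left_nonpos + right_nonneg)
        i = j
    return depth


if __name__ == "__main__":
    xs = list(range(9))
    ys = [2 * x + 1 for x in xs]          # perfectly linear data
    print(regression_depth(xs, ys, 2, 1))  # depth of the true line
```

For the perfectly linear data in the usage example, the true line has depth 9 = n, in line with the abstract's remark about linear data. After sorting, the scan is linear, so evaluating the depth of one line in this way costs O(n log n).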

Mia Hubert | Peter J. Rousseeuw
