On Maximum Depth and Related Classifiers

Over the last couple of decades, data depth has emerged as a powerful exploratory and inferential tool for multivariate data analysis, with widespread applications. This paper investigates the possible use of different notions of data depth in non-parametric discriminant analysis. First, we consider the situation where the prior probabilities of the competing populations are all equal and investigate classifiers that assign an observation to the population with respect to which it has the maximum location depth. We then propose a different depth-based classification technique for unequal prior problems, which is also useful for equal prior cases, especially when the populations have different scatters and shapes. We use some simulated data sets as well as some benchmark real examples to evaluate the performance of these depth-based classifiers. The large-sample behaviour of the misclassification rates of these depth-based non-parametric classifiers has been derived under appropriate regularity conditions.
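To make the maximum depth rule for the equal prior case concrete, the following is a minimal sketch rather than the paper's implementation. It assumes two-dimensional data and uses halfspace (location) depth as one possible notion of depth, approximated over a finite grid of directions. The function names `halfspace_depth_2d` and `max_depth_classify`, the grid size `n_dirs`, and the simulated Gaussian samples are all illustrative assumptions.

```python
import math
import random


def halfspace_depth_2d(x, sample, n_dirs=180):
    """Approximate halfspace (location) depth of x in a 2-D sample.

    For each direction u on a grid of angles, compute the fraction of
    sample points in the closed halfplane {y : u.(y - x) >= 0}. The depth
    is the minimum of that fraction over directions. A finite grid only
    gives an upper bound on the exact depth (assumption: good enough here).
    """
    n = len(sample)
    depth = 1.0
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        ux, uy = math.cos(theta), math.sin(theta)
        count = sum(
            1 for (px, py) in sample
            if ux * (px - x[0]) + uy * (py - x[1]) >= 0.0
        )
        depth = min(depth, count / n)
    return depth


def max_depth_classify(x, samples):
    """Return the index of the training sample in which x is deepest.

    Ties (for example, depth zero in every sample) go to the lowest index;
    this tie-breaking choice is an assumption of the sketch.
    """
    depths = [halfspace_depth_2d(x, s) for s in samples]
    return max(range(len(samples)), key=lambda j: depths[j])


# Simulated training data (illustrative only): two well-separated
# bivariate normal populations with unit variances.
rng = random.Random(0)
pop0 = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
pop1 = [(rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)) for _ in range(200)]

label_a = max_depth_classify((0.2, -0.1), [pop0, pop1])
label_b = max_depth_classify((3.8, 4.1), [pop0, pop1])
```

A point lying outside the convex hull of every training sample has halfspace depth zero in all of them, so this simple rule cannot separate such points. Any other depth function can be substituted for `halfspace_depth_2d` without changing `max_depth_classify`.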
