The deepest point for distributions in infinite dimensional spaces

Abstract: Identification of the center of a data cloud is one of the basic problems in statistics. One popular choice for such a center is the median, and several versions of the median in finite-dimensional spaces have been studied in the literature. In particular, medians based on different notions of data depth have been studied extensively by many researchers, who define the median as the point where the depth function attains its maximum value. In other words, the median is the deepest point in the sample space according to that definition. In this paper, we investigate the deepest point for probability distributions in infinite-dimensional spaces. We show that for some well-known depth functions in function spaces, such as the band depth and the half-region depth, there may not be any meaningful deepest point for many well-known and commonly used probability models. On the other hand, certain modified versions of those depth functions, as well as the spatial depth function, which can be defined in any Hilbert space, lead to useful notions of the deepest point with nice geometric and statistical properties. The empirical versions of those deepest points can be conveniently computed for functional data, and we demonstrate this using simulated and real datasets.
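As a rough illustration of what an empirical deepest point can look like, the sketch below computes the standard empirical spatial depth, 1 - ||(1/n) Σ (x - X_i)/||x - X_i|| ||, for curves observed on a common grid, and picks the sample curve with the largest depth. It is not taken from the paper: the function names `spatial_depth` and `deepest_sample_curve` are hypothetical, the Hilbert-space norm is approximated by the Euclidean norm of the discretized curves (an assumption that presumes an equally spaced grid), and restricting the search to the observed curves is a simplification rather than the authors' procedure.

```python
import numpy as np


def spatial_depth(x, sample):
    """Empirical spatial depth of curve x with respect to the rows of sample.

    x has shape (p,), sample has shape (n, p); all curves share one grid.
    """
    diffs = x - sample                      # (n, p) differences x - X_i
    norms = np.linalg.norm(diffs, axis=1)   # discretized norm of each difference
    keep = norms > 0                        # a zero difference contributes zero
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.sum(axis=0) / len(sample))


def deepest_sample_curve(sample):
    """Index of the observed curve with the largest empirical spatial depth."""
    depths = [spatial_depth(curve, sample) for curve in sample]
    return int(np.argmax(depths))


# Toy functional data: scaled sine curves on an equally spaced grid.
t = np.linspace(0.0, 1.0, 50)
curves = np.array([c * np.sin(2 * np.pi * t) for c in (-2, -1, 0, 1, 2)])
center = curves[deepest_sample_curve(curves)]
```

Searching only over observed curves keeps the example short; maximizing the depth over the whole function space would require an iterative optimizer, which this sketch does not attempt.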
