Model-based statistical depth with applications to functional data

Statistical depth, a commonly used analytic tool in non-parametric statistics, has been extensively studied for multivariate and functional observations over the past few decades. Although various forms of depth have been introduced, they are mainly procedure-based: their definitions are independent of the generative model for the observations. To address this problem, we introduce a generative model-based approach to defining statistical depth for both multivariate and functional data. The proposed model-based depth framework permits simple computation via Monte Carlo sampling and improves the accuracy of depth estimation. When applied to functional data, the proposed depth can capture important features such as continuity, smoothness, or phase variability, depending on the defining criteria. Specifically, we view functional data as realizations of a second-order stochastic process and define their depths through the eigensystem of the covariance operator. These new definitions are given through a proper metric related to the reproducing kernel Hilbert space of the covariance operator. We propose efficient algorithms to compute the proposed depths and establish estimation consistency. Through simulations and real data, we demonstrate that the proposed functional depths reveal important statistical information, such as that captured by the median and quantiles, and detect outliers.
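As a rough illustration of how a norm-based depth of this kind might be computed by Monte Carlo sampling, the sketch below works with curves sampled on a common grid. It is not the paper's algorithm. The function name `mc_norm_depth`, the truncation level `n_components`, the Gaussian model for the principal component scores, and the choice of depth as a simulated tail probability are all assumptions made for illustration.

```python
import numpy as np


def mc_norm_depth(curves, n_components=2, n_mc=5000, seed=0):
    """Hypothetical sketch: Monte Carlo depth from a truncated RKHS-type norm.

    curves: array of shape (n_curves, n_grid), all on the same grid.
    Returns one depth value in [0, 1] per curve (larger means more central).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(curves, dtype=float)

    # Empirical mean and covariance of the discretized curves.
    centered = X - X.mean(axis=0)
    cov = centered.T @ centered / (X.shape[0] - 1)

    # Eigensystem of the covariance matrix, keeping the leading components.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, phi = eigvals[order], eigvecs[:, order]

    # Truncated RKHS-type norm: squared scores weighted by 1 / eigenvalue.
    scores = centered @ phi
    norms = np.sqrt((scores ** 2 / lam).sum(axis=1))

    # Assumed model: independent Gaussian scores with variances lam.
    sim_scores = rng.normal(size=(n_mc, n_components)) * np.sqrt(lam)
    sim_norms = np.sqrt((sim_scores ** 2 / lam).sum(axis=1))

    # Depth = estimated probability that a model draw lies farther out.
    return (sim_norms[None, :] >= norms[:, None]).mean(axis=1)
```

Under these assumptions, a curve near the mean receives a depth close to 1, while a curve with an unusually large norm receives a depth close to 0, which gives a simple way to rank curves and flag potential outliers.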
