A Decomposition of Total Variation Depth for Understanding Functional Outliers

Abstract: Data depth-based methods for robust multivariate data analysis have been studied extensively, and recent developments have extended them to infinite-dimensional objects such as functional data. In this work, we propose a notion of depth for functional data, the total variation depth, which has many desirable features and is well suited to outlier detection. The proposed depth takes the form of an integral of a univariate depth function. We show that this formulation leads to a useful decomposition associated with the shape and magnitude outlyingness of functional data. Compared with magnitude outliers, shape outliers are often masked among the rest of the samples and are more difficult to identify. We then develop an effective procedure and visualization tools for detecting both types of outliers, while naturally accounting for the correlation in functional data. The outlier detection performance is investigated through simulations under various outlier models. Finally, the proposed methodology is demonstrated using real datasets of curves, images, and video frames.
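To make the phrase "an integral of a univariate depth function" concrete, the following is a minimal sketch of an integrated pointwise depth on a common, equally spaced grid. The abstract does not state the pointwise depth, so the form p(1 - p), with p the empirical proportion of sample curves at or below the curve of interest at a given grid point, is an assumption made for illustration. The function names `pointwise_depth` and `integrated_depth` and the toy data are hypothetical, and the shape and magnitude decomposition is not implemented here.

```python
def pointwise_depth(values, y):
    """Univariate depth of y among the sample values at one grid point.

    Assumed form (not given in the abstract): p * (1 - p), where p is the
    empirical proportion of sample values less than or equal to y.
    """
    n = len(values)
    p = sum(1 for v in values if v <= y) / n
    return p * (1.0 - p)


def integrated_depth(curves, index):
    """Average the pointwise depth of curves[index] over the grid.

    With an equally spaced grid and uniform weights, this average
    approximates the integral of the pointwise depth over the domain.
    """
    n_points = len(curves[0])
    total = 0.0
    for t in range(n_points):
        column = [c[t] for c in curves]  # all sample values at grid point t
        total += pointwise_depth(column, curves[index][t])
    return total / n_points


# Toy sample: five constant curves on a four-point grid (illustrative only).
curves = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0, 3.0],
    [9.0, 9.0, 9.0, 9.0],
]
depths = [integrated_depth(curves, i) for i in range(len(curves))]
```

In this toy sample the topmost curve lies above every other curve at each grid point, so its depth is the lowest in the sample, while the central curves receive higher values. A depth of this kind ranks curves by centrality only; separating shape outliers from magnitude outliers relies on the decomposition described in the abstract.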
