Adjusted functional boxplots for spatio-temporal data visualization and outlier detection

This article proposes a simulation-based method for adjusting functional boxplots to account for correlation when visualizing functional and spatio-temporal data and detecting outliers. We first investigate how spatio-temporal dependence affects the empirical outlier detection rule of 1.5 times the 50% central region. We then simulate outlier-free observations from a robust estimator of the covariance function of the data and select the constant factor in the functional boxplot so as to control the probability of correctly detecting no outliers. Finally, we apply the selected factor to the functional boxplot of the original data. The factor selection procedure and the adjusted functional boxplots are demonstrated on sea surface temperatures, spatio-temporal precipitation, and general circulation model (GCM) data, and the outlier detection performance is compared before and after the factor adjustment. Copyright © 2011 John Wiley & Sons, Ltd.
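
The factor selection described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that are not in the abstract: curves are ordered by modified band depth, the robust covariance estimate is replaced by a hypothetical exponential covariance on a unit grid, the simulated data are Gaussian, and the target is a 0.7% rate of falsely flagged curves (the rate of the classical 1.5 × IQR rule under normality). All function and parameter names are illustrative.

```python
import numpy as np

def modified_band_depth(curves):
    """Modified band depth of each row of an (n, p) array, using bands of two curves."""
    n, _ = curves.shape
    # Rank of every curve at every grid point (1 = lowest); assumes no ties.
    ranks = curves.argsort(axis=0).argsort(axis=0) + 1
    # Pairs whose band covers the curve at a point: one curve below and one
    # above, plus the n - 1 pairs that include the curve itself.
    containing_pairs = (ranks - 1) * (n - ranks) + (n - 1)
    return containing_pairs.mean(axis=1) / (n * (n - 1) / 2)

def flagged_rates(curves, factors):
    """Fraction of curves flagged as outliers for each candidate factor."""
    n = curves.shape[0]
    depth = modified_band_depth(curves)
    deepest = np.argsort(depth)[::-1][: int(np.ceil(n / 2))]
    # Envelope of the 50% central region.
    lower = curves[deepest].min(axis=0)
    upper = curves[deepest].max(axis=0)
    spread = upper - lower
    rates = []
    for f in factors:
        # A curve is flagged if it leaves the inflated envelope anywhere.
        outside = (curves < lower - f * spread) | (curves > upper + f * spread)
        rates.append(outside.any(axis=1).mean())
    return np.array(rates)

def select_factor(cov, n_curves, factors, target_rate=0.007, n_sim=20, seed=0):
    """Pick the factor whose average false-flag rate on outlier-free
    simulated curves is closest to target_rate (an assumed target)."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(cov.shape[0])
    rates = np.zeros(len(factors))
    for _ in range(n_sim):
        sim = rng.multivariate_normal(mean, cov, size=n_curves)
        rates += flagged_rates(sim, factors)
    rates /= n_sim
    best = int(np.argmin(np.abs(rates - target_rate)))
    return factors[best], rates

# Hypothetical stand-in for a robust covariance estimate: exponential
# covariance with range 0.3 on 30 equally spaced points in [0, 1].
grid = np.linspace(0.0, 1.0, 30)
cov = np.exp(-np.abs(grid[:, None] - grid[None, :]) / 0.3)
factors = np.arange(1.0, 3.01, 0.25)
factor, rates = select_factor(cov, n_curves=100, factors=factors)
print(f"selected factor: {factor:.2f}")
```

In practice the selected factor would then replace 1.5 when drawing the functional boxplot of the original data; changing the range parameter of the assumed covariance shows how the strength of dependence shifts the selected factor.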
