A Geometric Approach to Visualization of Variability in Functional Data

ABSTRACT We propose a new method for the construction and visualization of boxplot-type displays for functional data. We use a recent functional data analysis framework, based on a representation of functions called square-root slope functions, to decompose observed variation in functional data into three main components: amplitude, phase, and vertical translation. We then construct separate displays for each component, using the geometry and metric of each representation space, based on a novel definition of the median, the two quartiles, and extreme observations. Because the outlyingness of functional data is a complex concept, we propose to identify outliers based on any of the three main components after decomposition. We provide a variety of visualization tools for the proposed boxplot-type displays, including surface plots. We evaluate the proposed method using extensive simulations and then focus on three real data applications: exploratory data analysis of sea surface temperature functions, electrocardiogram functions, and growth curves. Supplementary materials for this article are available online.
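As a rough illustration of the representation named in the abstract, the square-root slope function (SRSF) of an absolutely continuous function f is commonly defined as q(t) = sign(f'(t)) sqrt(|f'(t)|). The sketch below is not the authors' implementation: the helper name `srsf`, the finite-difference derivative, and the test function are assumptions chosen for illustration. It shows one reason vertical translation is handled as its own component: the SRSF depends only on the derivative, so adding a constant to f leaves q unchanged.

```python
import numpy as np

def srsf(f, t):
    """Illustrative SRSF: q = sign(f') * sqrt(|f'|), with f' from finite differences."""
    df = np.gradient(f, t)  # numerical derivative on the sample grid t
    return np.sign(df) * np.sqrt(np.abs(df))

# Hypothetical example function on [0, 1]
t = np.linspace(0.0, 1.0, 201)
f = np.sin(2 * np.pi * t)

q = srsf(f, t)
q_shifted = srsf(f + 3.0, t)  # same function, vertically translated

# The SRSF discards vertical translation: both representations agree.
same_srsf = np.allclose(q, q_shifted)

# The derivative is recoverable from q alone, since q * |q| = f'.
df_recovered = q * np.abs(q)
```

Because q determines f only up to an additive constant, the starting value f(0) has to be stored separately if the original function is to be reconstructed.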
