Modeling Probability Density Functions as Data Objects

Abstract: Recent developments in the probabilistic and statistical analysis of probability density functions are reviewed. Density functions are treated as data objects, for which suitable notions of center and variability are discussed. Special attention is given to nonlinear methods that respect the constraints density functions must obey. Regression, time series, and spatial models are discussed. The exposition is illustrated with data examples. A supplementary vignette contains expanded versions of the data analyses with accompanying code.
