Functional CAR Models for Large Spatially Correlated Functional Datasets

ABSTRACT We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on functions defined on higher dimensional domains such as images. Through simulation studies, we demonstrate that accounting for the spatial correlation in our modeling leads to improved functional regression performance. Applied to a high-throughput spatially correlated copy number dataset, the model identifies genetic markers not identified by comparable methods that ignore spatial correlations. Supplementary materials for this article are available online.
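As a rough illustration of the building block named in the abstract, the sketch below constructs the precision matrix of a standard proper CAR model on a small lattice. It is not the authors' functional construction. The parameterization Q = tau * (D - rho * W), the rook-neighbour adjacency, and the names `lattice_adjacency`, `car_precision`, `tau`, and `rho` are all assumptions made for illustration. In the functional setting described above, one would expect a separate pair of such parameters at each functional location or basis coefficient, but that detail is not given here.

```python
import numpy as np


def lattice_adjacency(nrow, ncol):
    """Rook-neighbour adjacency matrix W for an nrow x ncol lattice (assumed layout)."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:  # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrow:  # neighbour below
                W[i, i + ncol] = W[i + ncol, i] = 1.0
    return W


def car_precision(W, tau, rho):
    """Proper CAR precision Q = tau * (D - rho * W), with D = diag(row sums of W).

    Assumes |rho| < 1 and every unit has at least one neighbour, which makes
    D - rho * W strictly diagonally dominant and hence positive definite.
    """
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)


W = lattice_adjacency(3, 4)             # 12 areal units
Q = car_precision(W, tau=2.0, rho=0.9)  # illustrative parameter values

# Draw one spatially correlated field x ~ N(0, Q^{-1}) via the Cholesky factor of Q.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(Q)               # Q = L @ L.T
x = np.linalg.solve(L.T, rng.standard_normal(Q.shape[0]))
```

Under this parameterization, the conditional mean of each unit given the others is `rho` times the average of its neighbours, and the conditional precision is `tau` times its neighbour count, which is the sense in which the model is conditionally autoregressive.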
