Massively Parallel Nonparametric Regression, With an Application to Developmental Brain Mapping

A penalized approach is proposed for performing large numbers of parallel nonparametric analyses of either of two types: restricted likelihood ratio tests of a parametric regression model versus a general smooth alternative, and nonparametric regression. Compared with naïvely performing each analysis in turn, our techniques reduce computation time dramatically. Viewing the large collection of scatterplot smooths produced by our methods as functional data, we develop a clustering approach to summarize and visualize these results. Our approach is applicable to ultra-high-dimensional data, particularly data acquired by neuroimaging; we illustrate it with an analysis of developmental trajectories of functional connectivity at each of approximately 70,000 brain locations. Supplementary materials, including an appendix and an R package, are available online.
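To make the computational idea concrete, the following is a minimal sketch, not the authors' implementation or their R package. It assumes that every location shares the same predictor values (for example, subject ages), and that a single fixed smoothing parameter `lam` is used for all locations; the paper's methods select smoothing parameters rather than fixing one. The truncated linear basis, the knot placement, and the names `truncated_linear_basis` and `smooth_all` are illustrative assumptions. Under these assumptions the penalized normal equations have one coefficient matrix for all responses, so a single linear solve yields every fitted curve at once.

```python
import numpy as np


def truncated_linear_basis(x, knots):
    """Design matrix with columns 1, x, and (x - k)_+ for each knot k."""
    x = np.asarray(x, dtype=float)
    hinge_cols = [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack([np.ones_like(x), x] + hinge_cols)


def smooth_all(x, Y, knots, lam):
    """Penalized least-squares fits for all columns of Y in one solve.

    Y has shape (n_subjects, n_locations). Only the knot coefficients are
    penalized (ridge penalty); the intercept and slope are left free.
    A common, fixed lam is an assumption of this sketch.
    """
    B = truncated_linear_basis(x, knots)
    P = np.diag([0.0, 0.0] + [1.0] * len(knots))
    # The left-hand matrix does not depend on the response, so one
    # solve handles every location simultaneously.
    coef = np.linalg.solve(B.T @ B + lam * P, B.T @ Y)
    return B @ coef


# Simulated data: 60 subjects, 500 locations sharing the same predictor.
rng = np.random.default_rng(0)
n, V = 60, 500
x = np.sort(rng.uniform(0.0, 1.0, n))
Y = np.sin(2 * np.pi * x)[:, None] + 0.3 * rng.standard_normal((n, V))
knots = np.linspace(0.1, 0.9, 9)

fits = smooth_all(x, Y, knots, lam=0.1)            # all locations at once
first_alone = smooth_all(x, Y[:, :1], knots, 0.1)  # one location by itself
```

Fitting one column by itself gives the same curve as the joint solve. The saving comes from forming and factorizing the penalized cross-product matrix once rather than once per location. The rows of `fits.T` could then be treated as functional data and passed to a clustering routine such as k-means.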
