varycoef: An R Package for Gaussian Process-based Spatially Varying Coefficient Models

Gaussian processes (GPs) are well-established tools for modeling dependent data, with applications in spatial statistics, time series analysis, and econometrics. In this article, we present the R package varycoef, which implements estimation, prediction, and variable selection for linear models with spatially varying coefficients (SVC) defined by GPs, so-called GP-based SVC models. Such models offer a high degree of flexibility while remaining relatively easy to interpret. Using varycoef, we demonstrate applications of (spatially) varying coefficient models to spatial and time series data, covering model and coefficient estimation, prediction, and variable selection. The package relies on state-of-the-art techniques from computational statistics, such as parallelization, model-based optimization, and covariance tapering. These allow the user to work with (S)VC models efficiently, i.e., to estimate models on large data sets in a feasible amount of time.
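To make the idea of a GP-based SVC model and of covariance tapering more concrete, the following is a minimal sketch in plain Python. It does not use varycoef or reproduce its API. All function names, parameter values, the one-dimensional locations, the exponential covariance, and the Wendland-1 taper are assumptions chosen for illustration. The sketch simulates a single coefficient beta(s) = mu + eta(s), where eta is a zero-mean GP, and shows that multiplying the covariance by a compactly supported taper sets many entries exactly to zero, which is what makes sparse matrix methods applicable.

```python
import math
import random


def exp_cov(d, variance, corr_range):
    """Exponential covariance (illustrative choice): variance * exp(-d / range)."""
    return variance * math.exp(-d / corr_range)


def wendland1_taper(d, taper_range):
    """Wendland-1 taper: compactly supported, exactly 0 for d >= taper_range."""
    if d >= taper_range:
        return 0.0
    r = d / taper_range
    return (1.0 - r) ** 4 * (1.0 + 4.0 * r)


def tapered_cov_matrix(locs, variance, corr_range, taper_range):
    """Element-wise product of the covariance and the taper on 1D locations."""
    n = len(locs)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = abs(locs[i] - locs[j])
            cov[i][j] = exp_cov(d, variance, corr_range) * wendland1_taper(d, taper_range)
    return cov


def cholesky(a):
    """Dense lower-triangular Cholesky factor; enough for a small example."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_svc(n=40, spacing=0.25, mu=1.0, variance=0.5, corr_range=2.0,
                 taper_range=3.0, noise_sd=0.1, seed=1):
    """Simulate y_i = x_i * (mu + eta(s_i)) + eps_i with eta a zero-mean GP."""
    rnd = random.Random(seed)
    locs = [i * spacing for i in range(n)]
    cov = tapered_cov_matrix(locs, variance, corr_range, taper_range)
    chol = cholesky(cov)
    # Draw eta = L z with z standard normal, so that Cov(eta) = L L^T = cov.
    z = [rnd.gauss(0.0, 1.0) for _ in range(n)]
    eta = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    beta = [mu + e for e in eta]                      # spatially varying coefficient
    x = [rnd.uniform(-1.0, 1.0) for _ in range(n)]    # covariate
    y = [x[i] * beta[i] + rnd.gauss(0.0, noise_sd) for i in range(n)]
    return locs, x, beta, y, cov


locs, x, beta, y, cov = simulate_svc()
n_zero = sum(1 for row in cov for v in row if v == 0.0)
print(f"{n_zero} of {len(cov) ** 2} covariance entries are exactly zero")
```

The dense Cholesky factorization here is only for readability. The computational benefit of tapering comes from storing and factorizing the tapered matrix in sparse form, which this sketch does not attempt.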
