Estimation and Inference for Generalized Geoadditive Models

Abstract In many application areas, data are collected on a count or binary response together with spatial covariate information. In this article, we introduce a new class of generalized geoadditive models (GGAMs) for spatial data distributed over complex domains. Through a link function, the proposed GGAM assumes that the mean of the discrete response variable depends on additive univariate functions of the explanatory variables and on a bivariate function that adjusts for the spatial effect. We propose a two-stage approach for estimating the components of the GGAM and making inferences about them. In the first stage, the univariate components and the geographical component are approximated by univariate polynomial splines and by bivariate penalized splines over triangulations, respectively. In the second stage, local polynomial smoothing is applied to the cleaned univariate data to average out the variation of the first-stage estimators. We investigate the consistency of the proposed estimators and the asymptotic normality of the estimators of the univariate components. We also construct a simultaneous confidence band for each univariate component. The performance of the proposed method is evaluated in two simulation studies. We apply the method to crash count data from the Tampa-St. Petersburg urbanized area in Florida. Supplementary materials for this article are available online.
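
The model structure described in the abstract can be written out as a rough sketch. The notation below is assumed for illustration and is not taken from the article: $g$ is the link function, $\mu$ the conditional mean of the response, $x_1,\dots,x_p$ the explanatory variables, $(s_1, s_2)$ the spatial location, $\phi_k$ the univariate component functions, and $\alpha$ the bivariate spatial function.

```latex
% Assumed notation; a sketch of the additive-plus-spatial structure only.
\[
  g\{\mu(x_1,\dots,x_p,\, s_1, s_2)\}
    \;=\; \beta_0 \;+\; \sum_{k=1}^{p} \phi_k(x_k) \;+\; \alpha(s_1, s_2).
\]
% Typical link choices for the two response types named in the abstract
% (standard examples, not confirmed as the article's choices):
%   count response (Poisson):     g(\mu) = \log \mu
%   binary response (Bernoulli):  g(\mu) = \log\{\mu / (1 - \mu)\}
```

Under this reading, the first stage approximates each $\phi_k$ with a univariate polynomial spline and $\alpha$ with a bivariate penalized spline, and the second stage re-estimates each $\phi_k$ by local polynomial smoothing once the other first-stage components have been removed.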
