Spatial Fay–Herriot models for small area estimation with functional covariates

Abstract The Fay–Herriot (FH) model is widely used in small area estimation and uses auxiliary information to reduce estimation variance at undersampled locations. We extend the type of covariate information used in the FH model to include functional covariates, such as social-media search loads or remote-sensing images (e.g., in crop-yield surveys). These functional covariates are incorporated through a two-stage dimension-reduction approach: a Karhunen–Loève expansion followed by stochastic search variable selection. In addition, the importance of modeling spatial autocorrelation in the FH model has recently been recognized; our model therefore combines the functional covariates with the intrinsic conditional autoregressive class of spatial models. We demonstrate the effectiveness of our approach through simulation and through an analysis of data from the American Community Survey, in which Google Trends searches over time serve as functional covariates for analyzing relative changes in the percentage of Spanish-speaking households in the eastern half of the United States.
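As a rough illustration of the two ingredients named in the abstract, the sketch below computes empirical Karhunen–Loève scores for discretized curves and builds an intrinsic conditional autoregressive (ICAR) precision matrix from an adjacency matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `kl_scores` and `icar_precision`, the simulated curves, the common observation grid, and the chain-shaped neighborhood structure are all hypothetical, and the stochastic search variable selection stage is not shown.

```python
import numpy as np


def kl_scores(curves, n_components):
    """Empirical Karhunen-Loeve (functional PCA) scores.

    curves: (n_areas, n_gridpoints) array, one curve per small area,
    all observed on a common grid (an assumption of this sketch).
    """
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the empirical eigenfunctions evaluated on the grid.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Scores are the projections of each centered curve onto the basis;
    # they could then act as scalar covariates in an area-level model.
    scores = centered @ basis.T
    explained = (s ** 2)[:n_components] / np.sum(s ** 2)
    return scores, basis, explained


def icar_precision(adjacency):
    """ICAR precision matrix Q = D - W for a symmetric 0/1 adjacency W."""
    w = np.asarray(adjacency, dtype=float)
    return np.diag(w.sum(axis=1)) - w


# Simulated example: 20 areas, each with a curve on 50 grid points.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
curves = (
    rng.normal(size=(20, 1)) * np.sin(2 * np.pi * grid)
    + rng.normal(size=(20, 1)) * np.cos(2 * np.pi * grid)
    + 0.1 * rng.normal(size=(20, 50))
)
scores, basis, explained = kl_scores(curves, n_components=3)

# Hypothetical neighborhood structure: areas arranged along a chain.
adjacency = np.diag(np.ones(19), 1) + np.diag(np.ones(19), -1)
Q = icar_precision(adjacency)
```

In a full analysis, the columns of `scores` would be candidate covariates for a variable selection step, and `Q` would define the prior precision of the spatial random effects up to a scale parameter.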
