Hierarchical multiresolution approaches for dense point-level breast cancer treatment data

The analysis of point-level (geostatistical) data has historically been plagued by computational difficulties, owing to the high dimension of the nondiagonal spatial covariance matrices that need to be inverted. This problem is greatly compounded in hierarchical Bayesian settings, since these inversions need to take place at every iteration of the associated Markov chain Monte Carlo (MCMC) algorithm. This paper offers an approach for modeling the spatial correlation at two separate scales. This reduces the computational problem to a collection of lower-dimensional inversions that remain feasible within the MCMC framework. The approach yields full posterior inference for the model parameters of interest, as well as for the fitted spatial response surface itself. We illustrate the importance and applicability of our methods using a collection of dense point-referenced breast cancer data collected over the mostly rural northern part of the state of Minnesota. Substantively, we wish to discover whether women who live more than a 60-mile drive from the nearest radiation treatment facility tend to opt for mastectomy over breast-conserving surgery (BCS, or "lumpectomy"), which is less disfiguring but requires 6 weeks of follow-up radiation therapy. Our hierarchical multiresolution approach resolves this question while still properly accounting for all sources of spatial association in the data.
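The computational point about replacing one large inversion with several smaller ones can be illustrated with a toy sketch. It is not the model described above: it assumes an exponential covariance function, simulated coordinates, and blocks treated as mutually independent, and every name and parameter value (`exp_cov`, `phi`, `nugget`, the vertical-strip partition) is hypothetical. It shows only that factorizing blocks of size n_k costs on the order of the sum of n_k^3 operations rather than n^3.

```python
import numpy as np

def exp_cov(coords, sigma2=1.0, phi=0.5, nugget=1e-6):
    """Exponential covariance matrix; sigma2, phi and nugget are illustrative values."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi) + nugget * np.eye(len(coords))

def gaussian_loglik(y, cov):
    """Zero-mean Gaussian log-likelihood computed from a Cholesky factor of cov."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)  # z'z equals the quadratic form y' cov^{-1} y
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (z @ z + logdet + len(y) * np.log(2.0 * np.pi))

def blockwise_loglik(y, coords, labels):
    """Sum of per-block log-likelihoods, assuming blocks are independent (a simplification)."""
    return sum(
        gaussian_loglik(y[labels == k], exp_cov(coords[labels == k]))
        for k in np.unique(labels)
    )

rng = np.random.default_rng(0)
n, n_blocks = 200, 4
coords = rng.uniform(0.0, 1.0, size=(n, 2))  # simulated locations, not real data
y = rng.normal(size=n)                       # simulated responses

# Hypothetical partition: split the unit square into vertical strips.
labels = np.minimum((coords[:, 0] * n_blocks).astype(int), n_blocks - 1)

# Rough operation counts for the dense factorizations.
full_cost = n ** 3
block_cost = sum(int(np.sum(labels == k)) ** 3 for k in range(n_blocks))

print("full O(n^3) count:  ", full_cost)
print("blockwise count:    ", block_cost)
print("blockwise log-lik.: ", blockwise_loglik(y, coords, labels))
```

In an MCMC setting this saving would be realized at every iteration, since the covariance parameters, and hence the matrices to be factorized, change with each update.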
