Reducing MAUP bias of correlation statistics between water quality and GI illness

Abstract This research investigates how spatial aggregation and the modifiable areal unit problem (MAUP) affect the correlation between drinking water quality and gastrointestinal (GI) illness. Using water quality estimates from hydraulic modeling of a water distribution system and a linear dose–response model, we simulate illness point patterns with a theoretically determined correlation to average pathogen concentrations. We then assess the sensitivity of Pearson's correlation statistic (r) to different aggregation units. Because public health data are often geocoded illness events, we assess the importance of their network-clustered structure by comparing two spatial scenarios. The first scenario distributes illness points randomly in space. The second uses network-clustered illness event patterns, in which the set of possible illness locations is limited to network nodes. We analyze multiple illness simulations with several sets of commonly used areal units, such as census units, regular grids, and Voronoi tessellations. The systematic bias in r due to the MAUP is estimated as an average reduction in r of 0.65. Accounting for the spatial network constraint of illness data during aggregation reduces this MAUP bias estimate by 41%, from 0.65 to 0.38.
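
The aggregation experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the concentration field is a synthetic gradient rather than hydraulic model output, illness counts are drawn from a Poisson distribution whose mean is linear in concentration (one possible reading of a linear dose–response model), and only regular grids are used as areal units. The names `pearson_r`, `grid_correlation`, and the parameter `beta` are hypothetical and introduced only for this example.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's correlation coefficient r between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))


def grid_correlation(xy, conc, cases, n_cells):
    """r between cell-mean concentration and cell-mean case count on an
    n_cells x n_cells regular grid over the unit square (empty cells dropped)."""
    xy = np.asarray(xy, dtype=float)
    # Column and row index of each point; clip points lying exactly on the edge.
    ix = np.minimum((xy[:, 0] * n_cells).astype(int), n_cells - 1)
    iy = np.minimum((xy[:, 1] * n_cells).astype(int), n_cells - 1)
    cell = ix * n_cells + iy
    n_total = n_cells * n_cells
    counts = np.bincount(cell, minlength=n_total)
    conc_sum = np.bincount(cell, weights=conc, minlength=n_total)
    case_sum = np.bincount(cell, weights=cases, minlength=n_total)
    occupied = counts > 0
    return pearson_r(conc_sum[occupied] / counts[occupied],
                     case_sum[occupied] / counts[occupied])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_nodes = 2000
    # Assumption: nodes scattered uniformly; a real network would be clustered.
    xy = rng.random((n_nodes, 2))
    # Assumption: synthetic concentration = west-east gradient + node-level noise.
    conc = 1.0 + xy[:, 0] + 0.5 * rng.random(n_nodes)
    beta = 3.0  # hypothetical dose-response slope
    cases = rng.poisson(beta * conc)

    print("node-level r:", round(pearson_r(conc, cases), 3))
    for n_cells in (20, 10, 5, 2):
        r = grid_correlation(xy, conc, cases, n_cells)
        print(f"{n_cells}x{n_cells} grid r:", round(r, 3))
```

Running the script prints r at the node level and for progressively coarser grids, which shows how the statistic shifts with the choice of areal unit. The direction and size of the shift depend on the assumed field and units, so the output should not be read as reproducing the 0.65 or 0.38 figures reported above.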
