GIS and Statistical Groundwater Vulnerability Modeling

ArcGIS Desktop* and ArcGIS Workstation* were used to extract statistically significant information from various geospatial data sets for input into a statistical model of groundwater vulnerability. The product of this effort was a probability map that identifies areas vulnerable to groundwater-quality degradation. The map is of interest to a variety of water professionals because it helps them make informed decisions about managing groundwater resources in the High Plains aquifer. Because the study area covers 174,000 mi² in parts of eight States, a Geographic Information System (GIS) was required to efficiently extract information for 31 variables from 14 spatial-data layers at 6,416 well locations throughout the study area. The layers, both vector and raster, included information about depth to water, aquifer saturated thickness, aquifer hydraulic conductivity, aquifer specific yield, average annual precipitation rates, percentage of irrigated land and agricultural land (irrigated and nonirrigated), chemical application rates (nitrogen, phosphorus, and pesticide), manure application rates, soil characteristics, and water-use estimates. For categorical data sets and certain continuous data sets (precipitation, depth to water, saturated thickness, hydraulic conductivity, specific yield, chemical applications, manure applications, and water use), the data were extracted directly from the layer at the location of each well by using a series of identity overlays. For layers where information needed to be related to the area around a well (soil characteristics, percentage of irrigated land around a well, and percentage of agricultural land around a well), buffers of varying sizes were created around each well, and the information was inventoried for the buffer areas by using both vector union techniques and raster map-algebra techniques.
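The raster side of the buffer inventory described above can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in the study; it uses NumPy on a small made-up grid, and the function name `percent_in_buffer`, the grid values, the well position, and the buffer radius are all hypothetical.

```python
import numpy as np

def percent_in_buffer(mask, well_rc, radius_cells):
    """Percentage of True cells in `mask` inside a circular buffer around a well.

    mask         -- 2-D boolean array (for example, True where land is irrigated)
    well_rc      -- (row, col) index of the cell containing the well
    radius_cells -- buffer radius expressed in cell widths
    """
    rows, cols = np.indices(mask.shape)
    r0, c0 = well_rc
    # A cell is in the buffer if its center lies within the radius of the well cell.
    inside = (rows - r0) ** 2 + (cols - c0) ** 2 <= radius_cells ** 2
    return 100.0 * mask[inside].mean()

# Hypothetical 5 x 5 land-use grid: the two westernmost columns are irrigated.
irrigated = np.zeros((5, 5), dtype=bool)
irrigated[:, :2] = True

# Hypothetical well at the grid center with a one-cell buffer radius.
pct = percent_in_buffer(irrigated, well_rc=(2, 2), radius_cells=1)
print(f"Irrigated land within buffer: {pct:.1f} percent")
```

Repeating the call with several radii would mimic the "buffers of varying sizes" step, giving one candidate variable per buffer size for each well.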
The extracted data were used as variable input for an iterative series of statistical calculations using logistic regression. These calculations determined which of the variables (layers), or combinations of variables, were significantly correlated with observed nitrate concentrations in the groundwater. The significant variables became part of an equation defining the probability that a dissolved constituent in the groundwater exceeds a specified concentration. Once the probability equation was defined, the GIS layers representing the significant variables were converted to raster data sets where necessary (vector layers were converted; raster layers were used as is) to take advantage of the map-algebra capabilities of ArcGIS. The significant raster layers, the equation, and the coefficient for each layer were then brought back into the GIS, where map algebra was used to calculate the probability surface, which could then be readily visualized.
*Trade names are used for descriptive purposes only and should not be considered an endorsement of a specific product by the U.S. Government.
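The map-algebra step, in which a fitted logistic-regression equation is applied cell by cell to the significant raster layers, can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: it uses NumPy instead of ArcGIS, and the layer names, intercept, coefficients, and grid values are hypothetical because the abstract does not report the fitted model. The standard logistic form p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))) is assumed.

```python
import numpy as np

def probability_surface(intercept, coefficients, rasters):
    """Apply a logistic-regression equation cell by cell to co-registered rasters.

    intercept    -- fitted constant term b0
    coefficients -- dict mapping layer name to its fitted coefficient
    rasters      -- dict mapping layer name to a 2-D array; all arrays share one shape
    """
    first = next(iter(rasters.values()))
    logit = np.full(first.shape, intercept, dtype=float)
    for name, beta in coefficients.items():
        logit += beta * rasters[name]
    # Logistic transform converts the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-logit))

# Hypothetical 2 x 2 layers; values and units are made up for illustration.
layers = {
    "depth_to_water": np.array([[50.0, 10.0], [100.0, 200.0]]),
    "pct_irrigated": np.array([[0.0, 50.0], [0.0, 100.0]]),
}
# Hypothetical coefficients, not values from the study.
betas = {"depth_to_water": -0.01, "pct_irrigated": 0.02}

p = probability_surface(intercept=0.5, coefficients=betas, rasters=layers)
print(np.round(p, 3))
```

Each cell of `p` is the modeled probability that the constituent exceeds the chosen concentration threshold at that location, so the array can be mapped directly as a probability surface.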