Using Regression Tree Analysis to Improve Predictions of Low-Flow Nitrate and Chloride in Willamette River Basin Watersheds

This study examines regression tree analysis as a tool for evaluating the hydrologic and land use factors that affect nitrate and chloride stream concentrations during low-flow conditions. Although this data mining technique has been used to assess a range of ecological parameters, it has not previously been used for stream water quality analysis. Regression tree analysis was conducted on nitrate and chloride data from 71 watersheds in the Willamette River Basin to determine whether the method offers greater predictive ability than standard multiple linear regression, and to elucidate the potential roles of controlling mechanisms. Metrics used in the models included a variety of watershed-scale landscape indices and land use classifications. Regression tree analysis significantly enhanced model accuracy over multiple linear regression, increasing R² from 0.38 to 0.75 for nitrate and from 0.64 to 0.85 for chloride, an improvement also indicated by the ΔAIC values. These improvements are primarily attributed to the ability of regression trees to handle interactions and non-linear functions associated with watershed heterogeneity within the basin more effectively. Whereas hydrologic factors governed the conservative chloride tracer in the model, land use dominated control of nitrate concentrations. Watersheds with higher agricultural activity did not necessarily yield high nitrate concentrations, but agricultural areas combined with either small proportions of forested land or greater urbanization generated nitrate levels far exceeding water quality standards. Although further refinements are recommended, we conclude that regression tree analysis offers water resource managers a promising tool that improves on the predictive ability of standard statistical methods, provides insight into controlling mechanisms, and helps identify catchment characteristics associated with water quality impairment.
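The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' model or data: the 72 watersheds are a synthetic grid, the predictor names `ag` and `forest` and the nitrate values are hypothetical, and the tree is a bare greedy CART-style partitioner written for illustration. The response is high only where agricultural fraction is large and forest fraction is small, the kind of interaction a tree can isolate and an additive linear model cannot. AIC uses the common least-squares form n ln(RSS/n) + 2k, and counting a tree's parameters as leaves plus splits is an assumption.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(X, y, depth, min_leaf=5):
    """Greedy regression tree: choose the split that minimizes total SSE."""
    best = None
    if depth > 0 and len(y) >= 2 * min_leaf:
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                left = [yi for row, yi in zip(X, y) if row[j] <= t]
                right = [yi for row, yi in zip(X, y) if row[j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = sse(left) + sse(right)
                if best is None or cost < best[0]:
                    best = (cost, j, t)
    if best is None:
        return sum(y) / len(y)  # leaf: mean response
    _, j, t = best
    lo = [(r, yi) for r, yi in zip(X, y) if r[j] <= t]
    hi = [(r, yi) for r, yi in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in lo], [v for _, v in lo], depth - 1, min_leaf),
            fit_tree([r for r, _ in hi], [v for _, v in hi], depth - 1, min_leaf))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def count_leaves(node):
    if not isinstance(node, tuple):
        return 1
    return count_leaves(node[2]) + count_leaves(node[3])

def fit_linear(X, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan)."""
    A = [[1.0] + list(r) for r in X]
    p = len(A[0])
    M = [[sum(a[i] * a[k] for a in A) for k in range(p)]
         + [sum(a[i] * yi for a, yi in zip(A, y))] for i in range(p)]
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(p):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [a - f * b for a, b in zip(M[r], M[i])]
    return [M[i][p] / M[i][i] for i in range(p)]

def rss(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred))

def r_squared(y, pred):
    return 1.0 - rss(y, pred) / sse(y)

def aic(y, pred, k):
    """Least-squares AIC with k fitted parameters."""
    n = len(y)
    return n * math.log(rss(y, pred) / n) + 2 * k

# Synthetic watersheds on a 9 x 8 grid of (ag fraction, forest fraction).
rng = random.Random(0)
X = [(a / 8, f / 7) for a in range(9) for f in range(8)]
# Hypothetical nitrate response: elevated only with high ag AND low forest.
y = [0.5 + (6.0 if ag > 0.4 and forest < 0.3 else 0.0) + rng.gauss(0, 0.3)
     for ag, forest in X]

tree = fit_tree(X, y, depth=2)
tree_pred = [predict_tree(tree, row) for row in X]
beta = fit_linear(X, y)
lin_pred = [beta[0] + beta[1] * ag + beta[2] * forest for ag, forest in X]

r2_tree, r2_lin = r_squared(y, tree_pred), r_squared(y, lin_pred)
k_tree = 2 * count_leaves(tree) - 1  # leaves + splits (assumed convention)
aic_tree, aic_lin = aic(y, tree_pred, k_tree), aic(y, lin_pred, len(beta))
delta_aic = aic_lin - aic_tree  # positive values favor the tree

print(f"R2 linear={r2_lin:.2f}  R2 tree={r2_tree:.2f}  dAIC={delta_aic:.1f}")
```

On this synthetic interaction the tree should fit far better than the additive linear model, but the numbers say nothing about the Willamette data. Both models are scored on the data they were fitted to, so a real analysis would also need pruning or cross-validation to guard against overfitting.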
