Clustering flood events from water quality time series using Latent Dirichlet Allocation model

To improve hydro-chemical modeling and forecasting, flood-induced variability in water chemistry and the watershed processes that control it need to be better understood. The literature often relies on assumptions, for instance that stream chemistry responds differently to rainfall events depending on the season, but methods to verify such assumptions are not well developed. Studies often examine only a few floods at a time and use chemicals as tracers. Grouping similar events from large multivariate datasets using principal component analysis and clustering methods helps explain hydrological processes; however, these methods have limitations, such as the need to define flood descriptors and the assumption of linearity. Most clustering methods have been applied to regionalization, with more focus on mapping results than on understanding processes. In this study, we extracted flood patterns using the probabilistic Latent Dirichlet Allocation (LDA) model, which, to our knowledge, had not previously been used in hydrology. The LDA method handles multivariate temporal datasets without requiring explanatory factors to be defined beforehand or representative floods to be selected. We analyzed a multivariate dataset from a long-term observatory (Kervidy-Naizin, western France) containing daily measurements of four solutes over 12 years: nitrate, chloride, dissolved organic carbon, and sulfate. The LDA method extracted four distinct patterns whose occurrence varied by season. Each pattern can be explained by seasonal hydrological processes. Hydro-meteorological parameters help explain the processes leading to these patterns, which improves understanding of flood-induced variability in water quality. Thus, the LDA method appears useful for analyzing long-term datasets.
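As a rough illustration of how LDA can be applied to flood events, the sketch below treats each flood as a "document" and each daily change in a solute as a "word", then fits LDA so that every flood receives a mixture over a small number of patterns. Everything specific here is an assumption made for illustration and is not taken from the study: the word encoding (solute name plus direction of change), the 5% threshold, the hyperparameters `alpha` and `beta`, the synthetic concentration values, and the use of collapsed Gibbs sampling, which may differ from the inference scheme used by the authors.

```python
import random
from collections import Counter


def encode_event(series_by_solute, threshold=0.05):
    """Turn one flood event into a bag of words (hypothetical encoding).

    Each word is '<solute>_<up|down|flat>', based on the relative
    day-to-day change in concentration.
    """
    words = []
    for solute, values in series_by_solute.items():
        for prev, cur in zip(values, values[1:]):
            change = (cur - prev) / prev if prev else 0.0
            if change > threshold:
                label = "up"
            elif change < -threshold:
                label = "down"
            else:
                label = "flat"
            words.append(f"{solute}_{label}")
    return words


def lda_gibbs(docs, n_patterns, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-event pattern mixtures."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_pattern = [[0] * n_patterns for _ in docs]       # counts per event
    pattern_word = [Counter() for _ in range(n_patterns)]  # counts per word
    pattern_total = [0] * n_patterns
    assign = []

    # Random initial assignment of every word to a pattern.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_patterns)
            z_doc.append(z)
            doc_pattern[d][z] += 1
            pattern_word[z][w] += 1
            pattern_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                doc_pattern[d][z] -= 1
                pattern_word[z][w] -= 1
                pattern_total[z] -= 1
                # Resample from the conditional distribution.
                weights = [
                    (doc_pattern[d][k] + alpha)
                    * (pattern_word[k][w] + beta)
                    / (pattern_total[k] + vocab_size * beta)
                    for k in range(n_patterns)
                ]
                z = rng.choices(range(n_patterns), weights=weights)[0]
                assign[d][i] = z
                doc_pattern[d][z] += 1
                pattern_word[z][w] += 1
                pattern_total[z] += 1

    # Smoothed pattern proportions for each event.
    theta = [
        [(c + alpha) / (len(doc) + n_patterns * alpha) for c in row]
        for row, doc in zip(doc_pattern, docs)
    ]
    return theta, pattern_word


# Synthetic toy events (made-up concentrations, for illustration only).
events = [
    {"nitrate": [10.0, 8.0, 7.0, 9.0], "doc": [3.0, 5.0, 6.0, 4.0]},
    {"nitrate": [11.0, 9.0, 8.0, 10.0], "doc": [2.5, 4.0, 5.5, 3.5]},
    {"nitrate": [5.0, 6.5, 7.5, 6.0], "doc": [6.0, 6.1, 6.0, 5.9]},
    {"nitrate": [4.0, 5.5, 6.5, 5.0], "doc": [7.0, 7.1, 7.0, 6.9]},
]
docs = [encode_event(e) for e in events]
theta, pattern_word = lda_gibbs(docs, n_patterns=2)
for i, mix in enumerate(theta):
    print(f"event {i}: dominant pattern {mix.index(max(mix))}")
```

Each row of `theta` gives the share of each pattern in one event, so events can be grouped by their dominant pattern and then compared against season or hydro-meteorological variables. In practice, a library implementation with a principled choice of the number of patterns would be preferable to this hand-written sampler.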
