Data Mining Application in Assessment of Weather-Based Influent Scenarios for a WWTP: Getting the Most Out of Plant Historical Data

Since the introduction of environmental legislation and directives, the impact of combined sewer overflows (CSO) on receiving water bodies has become a priority concern in the water and wastewater treatment industry. Time-consuming and expensive local sampling and monitoring campaigns are usually carried out to estimate the characteristic flow and pollutant concentrations of CSO water. This study focuses on estimating the frequency and duration of wet-weather events and their impacts on the influent flow and wastewater characteristics of the largest Italian wastewater treatment plant (WWTP), located in Castiglione Torinese. Eight years (2009–2016) of historical plant data, together with arithmetic mean daily precipitation rates (PI) for the plant catchment area, are analyzed. Relationships between PI and volumetric influent flow rate (Qin), chemical oxygen demand (COD), ammonium (N-NH4), and total suspended solids (TSS) are investigated. A time series data mining (TSDM) method is implemented in MATLAB: a sliding window algorithm (SWA) segments the time series and partitions the available records into wet- and dry-weather events. Based on the TSDM results, a case-specific wet-weather definition is proposed for the Castiglione Torinese WWTP. Two significant weather-based influent scenarios are assessed by kernel density estimation. The results confirm that the proposed method, which relies only on routinely collected plant data, can be used for planning emergency response and long-term preparedness for extreme climate conditions at a WWTP. Implementing the results in dynamic process simulation models can improve the plant's operational efficiency in managing fluctuating loads.
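The abstract names two techniques, sliding-window partitioning of records into wet and dry periods and kernel density estimation of the resulting scenarios, without giving their details. The following is only a minimal Python sketch of the general idea, not the study's MATLAB implementation. The labeling rule (a day is wet if any of the last `window` days reached `threshold`), the threshold and window values, the bandwidth, and the sample data are all assumptions made for illustration.

```python
import math

def label_wet_days(precip, threshold, window):
    """Mark day i as wet if any of the `window` days ending at i
    has precipitation >= threshold (an assumed rule, not the paper's)."""
    labels = []
    for i in range(len(precip)):
        start = max(0, i - window + 1)
        labels.append(any(p >= threshold for p in precip[start:i + 1]))
    return labels

def segment_events(labels):
    """Collapse per-day labels into (is_wet, start_index, duration) runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i - start))
            start = i
    return segments

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )
    return density

# Made-up daily records: precipitation (mm/d) and influent flow (arbitrary units).
precip = [0.0, 0.0, 6.0, 1.0, 0.0, 0.0, 0.0, 12.0, 0.0, 0.0]
qin = [20.0, 21.0, 30.0, 27.0, 22.0, 20.0, 19.0, 35.0, 28.0, 21.0]

labels = label_wet_days(precip, threshold=5.0, window=2)
segments = segment_events(labels)

# Split the flow record by scenario and estimate one density per scenario.
wet_q = [q for q, wet in zip(qin, labels) if wet]
dry_q = [q for q, wet in zip(qin, labels) if not wet]
wet_density = gaussian_kde(wet_q, bandwidth=2.0)
dry_density = gaussian_kde(dry_q, bandwidth=2.0)

# Frequency and duration of wet-weather events follow from the segments.
wet_events = [(start, length) for is_wet, start, length in segments if is_wet]
```

The trailing window lets a day after rainfall still count as wet, which reflects the delayed response of influent flow to precipitation. In practice the window length and threshold would have to be calibrated against the plant's own records.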
