Detection and forecasting of sludge bulking events using data mining and machine learning approach

Yuanhao Zhao, B.E.
Marquette University, 2012

Sludge bulking is the most notable cause of activated sludge plant failure (i.e., exceeding discharge permit quality limits) worldwide. Numerous mathematical methods have been applied to detect sludge bulking and to provide warning so that it can be prevented. However, these models often fail to forecast sludge bulking events reliably because they rely on a point-by-point “curve-fitting” strategy, while bulking events account for only a small fraction of the data points in the time series. This study therefore considers three machine learning approaches that focus on detecting the temporal patterns that precede sludge bulking events.

The main objective of this research is to apply machine learning and statistical methods to detect the hidden temporal patterns in the sludge volume index (SVI) data and related water-quality parameters that occur before high SVI values (sludge bulking), and then to use those patterns to forecast future high SVI values. Three methods are applied: the improved Time Series Data Mining (TSDM) method, the Hidden Markov Models (HMMs) method, and a combined method of Hidden Markov Models and multinomial logistic regression (MLR).

The results and analysis show that the improved TSDM method and the HMMs method are capable of detecting and predicting sludge bulking events. The improved TSDM method achieves a sludge bulking event prediction accuracy between 60% and 100%. The HMMs method can provide warning information to wastewater treatment plant (WWTP) operators even if it detects only the first state of the pattern leading to sludge bulking: once the first pattern state was detected, there was a high probability (>80% in all cases, mostly >90%) that sludge bulking would occur. However, both of these methods have limitations because they are new methods applied to the sludge bulking problem. For the combined method, although the results are not useful for the detection of sludge bulking, some wastewater quality parameters are found to have a significant impact on sludge bulking, namely sludge retention time (SRT) and effluent pH, for all three batteries.
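
The idea of looking for temporal patterns that precede high SVI values, rather than fitting every point, can be illustrated with a minimal Python sketch. It is not the thesis's improved TSDM method, whose details are not given in this abstract. It only shows the general step of cutting a series into fixed-length windows and marking those followed by a high value. The function name `embed_and_label`, the window length of 3, the 150 mL/g threshold, and the SVI readings are all assumptions made for illustration.

```python
import numpy as np

def embed_and_label(svi, window=3, threshold=150.0):
    """Return (patterns, labels).

    Each pattern is `window` consecutive SVI values; its label is True
    when the value immediately after the window exceeds `threshold`.
    The window length and threshold are illustrative assumptions.
    """
    svi = np.asarray(svi, dtype=float)
    n = len(svi) - window
    if n <= 0:
        return np.empty((0, window)), np.empty(0, dtype=bool)
    patterns = np.stack([svi[i:i + window] for i in range(n)])
    labels = svi[window:] > threshold
    return patterns, labels

# Hypothetical SVI readings (mL/g), invented for illustration only.
svi = [90, 95, 110, 130, 160, 120, 100, 105, 125, 155]
patterns, labels = embed_and_label(svi, window=3, threshold=150.0)

# Windows that came immediately before a high-SVI reading.
precursors = patterns[labels]
print(precursors)
```

In this toy series the two windows preceding a high reading both show a steady rise, which is the kind of precursor pattern a pattern-based method would then try to recognize in new data.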
