Knowledge discovery by means of inductive methods in wastewater treatment plant data

Artificial intelligence techniques, including machine learning methods, and statistical techniques have shown promising results as decision support tools because of their capabilities for knowledge discovery, heuristic reasoning, and working with uncertain and qualitative information. Wastewater treatment plants are complex environmental processes that are difficult to manage and control. This paper discusses the qualitative and quantitative performance of several machine learning and statistical methods for discovering knowledge patterns in data. The methods are tested and compared on a wastewater treatment data set. The methods used are induction of decision trees, two different rule induction techniques, and two memory-based learning methods (instance-based learning and case-based learning). The knowledge patterns discovered by the different methods are compared in terms of predictive accuracy, the number of attributes and examples used, and their meaningfulness to domain experts.
