Improving single classifiers' prediction accuracy for an underground water pump station in a gold mine using ensemble techniques

In this paper, six single classifiers (support vector machine, artificial neural network, naïve Bayes classifier, decision tree, radial basis function network and k-nearest neighbours) were used to predict water dam levels at an underground pump station in a deep gold mine. Bagging and Boosting ensemble techniques were then applied to increase the prediction accuracy of the single classifiers. To raise the prediction accuracy further, a mutual information ensemble approach is introduced that improves on both the single classifiers and the Bagging and Boosting results. This ensemble is used to classify, and thereby monitor and predict, the underground water dam levels at a single-pump-station deep gold mine in South Africa. Mutual information theory is used to determine the optimum number of classifiers for building the most accurate ensemble. In terms of prediction accuracy, the results show that the mutual information ensemble outperformed the other ensembles and the single classifiers and is more efficient for classifying underground water dam levels. However, constructing this ensemble is more complicated than applying the Bagging and Boosting techniques.
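The abstract does not give the details of how mutual information selects the ensemble members, so the following is only a minimal illustrative sketch of the general idea: score each trained classifier by the empirical mutual information between its predictions and the true labels, keep the top-scoring ones, and combine them by majority vote. The function names, the unweighted voting rule, and the top-k selection criterion are all assumptions for illustration, not the authors' actual method.

```python
import math
from collections import Counter

def mutual_information(preds, labels):
    """Empirical mutual information I(preds; labels) in bits,
    estimated from the joint frequency of (prediction, label) pairs."""
    n = len(labels)
    joint = Counter(zip(preds, labels))
    p_pred = Counter(preds)
    p_lab = Counter(labels)
    mi = 0.0
    for (p, y), c in joint.items():
        pxy = c / n
        mi += pxy * math.log2(pxy / ((p_pred[p] / n) * (p_lab[y] / n)))
    return mi

def select_ensemble(all_preds, labels, k):
    """Rank classifiers by mutual information with the labels
    and keep the indices of the top k (assumed selection rule)."""
    ranked = sorted(range(len(all_preds)),
                    key=lambda i: mutual_information(all_preds[i], labels),
                    reverse=True)
    return ranked[:k]

def majority_vote(all_preds, members):
    """Combine the selected members by unweighted majority vote
    over each sample's predictions."""
    n = len(all_preds[0])
    return [Counter(all_preds[i][j] for i in members).most_common(1)[0][0]
            for j in range(n)]
```

For example, given the per-classifier prediction lists for a validation set of dam-level classes, `select_ensemble` would rank a classifier that matches the labels exactly above a noisier one, and a constant predictor (zero mutual information) last; `majority_vote` then fuses the chosen members' outputs into the ensemble prediction.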
