Uncertainty quantification and integration of machine learning techniques for predicting acid rock drainage chemistry: a probability bounds approach.

Acid rock drainage (ARD) is a major global pollution problem that has adversely impacted the environment. Identifying and quantifying uncertainties are integral parts of ARD assessment and risk mitigation; however, previous studies on predicting ARD chemistry have not fully addressed these uncertainties. In this study, artificial neural networks (ANN) and support vector machines (SVM) are used to predict ARD chemistry, and their predictive uncertainties are quantified using probability bounds analysis. The ANN and SVM predictions are then integrated using four aggregation methods to improve on the individual predictions. The results show that ANN performed better than SVM in enveloping the observed concentrations, and that integrating the ANN and SVM predictions with the aggregation methods improved on the predictions of either technique alone.
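
The idea of "enveloping" observations and of aggregating two models' bounds can be illustrated with a minimal sketch. It assumes each model yields a lower and an upper bound per sample, and it uses a simple envelope (union of the two intervals) as one possible aggregation rule. The abstract does not name the four aggregation methods, so the envelope rule, the function names, and all numbers below are illustrative assumptions, not the study's method or data.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) of a predicted concentration


def envelope(a: Interval, b: Interval) -> Interval:
    """Smallest interval that contains both input intervals (assumed aggregation rule)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def coverage(intervals: List[Interval], observed: List[float]) -> float:
    """Fraction of observed values that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, observed))
    return hits / len(observed)


# Made-up bounds and observations, for illustration only.
ann = [(1.0, 3.0), (2.0, 5.0), (0.5, 1.5), (4.0, 6.0)]
svm = [(1.5, 2.5), (3.0, 4.0), (1.0, 2.0), (6.5, 7.0)]
observed = [2.8, 3.5, 1.8, 6.2]

# Aggregate the two models sample by sample.
combined = [envelope(a, s) for a, s in zip(ann, svm)]

print("ANN coverage:     ", coverage(ann, observed))
print("SVM coverage:     ", coverage(svm, observed))
print("Envelope coverage:", coverage(combined, observed))
```

An envelope can only widen the bounds, so its coverage is never lower than that of either input; the price is less informative intervals, which is why other aggregation rules may be preferred in practice.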
