Random forest predictive modeling of mineral prospectivity with small number of prospects and data with missing values in Abra (Philippines)

Machine learning methods that have been used in data-driven predictive modeling of mineral prospectivity (e.g., artificial neural networks) invariably require a large number of training prospects/locations and are unable to handle missing values in certain evidential data. The Random Forests (RF) algorithm, which is a machine learning method, has recently been applied to data-driven predictive mapping of mineral prospectivity, and so it is instructive to further study its efficacy in this particular field. This case study, carried out using data from Abra (Philippines), examines (a) whether RF modeling can be used for data-driven modeling of mineral prospectivity in areas with few (i.e., <20) mineral occurrences and (b) whether RF modeling can handle evidential data with missing values. We found that RF modeling outperforms weights-of-evidence (WofE) modeling of porphyry-Cu prospectivity in the Abra area, where 12 porphyry-Cu prospects are known to exist. Moreover, just like WofE modeling, RF modeling allows analysis of the spatial associations of known prospects with individual layers of evidential data. Furthermore, RF modeling can handle missing values in evidential data through an RF-based imputation technique, whereas in WofE modeling missing values are simply represented by zero weights. Therefore, the RF algorithm is potentially more useful than existing methods that are currently used for data-driven predictive mapping of mineral prospectivity. In particular, unlike artificial neural networks, it is not a purely black-box method in the context of data-driven predictive modeling of mineral prospectivity. However, further testing of the method in other areas with few mineral occurrences is needed to fully investigate its usefulness in data-driven predictive modeling of mineral prospectivity.
The Random Forest (RF) algorithm is tested in data-driven modeling of mineral prospectivity.
The RF algorithm can be used in areas with few (i.e., <20) mineral occurrences.
The RF algorithm can handle evidential data with missing values.
The RF algorithm allows analysis of spatial associations of prospects with every evidence layer.
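
The abstract notes that WofE modeling represents missing evidence by zero weights. The minimal sketch below illustrates that convention for a single binary evidence layer, using the standard WofE definitions W+ = ln[P(B|D)/P(B|not D)] and W- = ln[P(not B|D)/P(not B|not D)]. The function names (`wofe_weights`, `cell_weight`) and the small synthetic grid are assumptions made for illustration only; they are not taken from the study and do not reproduce its data or results.

```python
import math

def wofe_weights(evidence, prospect):
    """Return (W_plus, W_minus) for one binary evidence layer.

    evidence: 1 = pattern present, 0 = pattern absent, None = missing
    prospect: 1 = cell contains a known prospect, 0 = it does not
    Cells with missing evidence are left out of the counts.
    """
    n_bd = n_bnd = n_nbd = n_nbnd = 0
    for e, d in zip(evidence, prospect):
        if e is None:
            continue                      # missing evidence: not counted
        if e == 1 and d == 1:
            n_bd += 1                     # pattern present, prospect present
        elif e == 1 and d == 0:
            n_bnd += 1                    # pattern present, no prospect
        elif e == 0 and d == 1:
            n_nbd += 1                    # pattern absent, prospect present
        else:
            n_nbnd += 1                   # pattern absent, no prospect
    n_d = n_bd + n_nbd                    # prospect cells with known evidence
    n_nd = n_bnd + n_nbnd                 # non-prospect cells with known evidence
    w_plus = math.log((n_bd / n_d) / (n_bnd / n_nd))
    w_minus = math.log((n_nbd / n_d) / (n_nbnd / n_nd))
    return w_plus, w_minus

def cell_weight(e, w_plus, w_minus):
    """Weight assigned to one cell; missing evidence contributes zero."""
    if e is None:
        return 0.0
    return w_plus if e == 1 else w_minus

# Synthetic 16-cell example (hypothetical values, two cells with missing evidence).
evidence = [1, 1, 1, 0, 1, 0, 0, 0, None, None, 1, 0, 0, 0, 0, 0]
prospect = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

w_plus, w_minus = wofe_weights(evidence, prospect)
weights = [cell_weight(e, w_plus, w_minus) for e in evidence]
contrast = w_plus - w_minus               # positive contrast: positive spatial association
```

In this sketch a cell with missing evidence neither raises nor lowers the posterior log-odds for that layer, which is the behavior the abstract contrasts with RF-based imputation, where a value is estimated for the missing cell instead.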
