Random Forests, a supervised machine learning algorithm, provides a robust, data-driven means of predicting lithology from geophysical, geochemical and remote sensing data. As an essential part of input selection, datasets are ranked in order of importance to the classification outcome. Those ranked most important provide, on average, the most decisive splits between lithological classes. These rankings give explorers an additional line of reasoning to complement conventional geophysical and geochemical interpretation workflows. The approach shows potential to aid in identifying important criteria for distinguishing geological map units during early-stage exploration. This can help direct subsequent expenditure towards the acquisition and further development of the datasets that will be most productive for mapping. In this case study, we use Random Forests to classify the lithology of a project area in the Central African Copperbelt, Zambia. The project area has extensive magnetic, radiometric, electromagnetic and multi-element geochemical coverage but only sparse geological observations. Under various training data paradigms, Random Forests produced a series of varying but closely related lithological maps. In this study, training data were restricted to outcrop, simulating the data available at the early stages of the project. Variable ranking highlighted the datasets of greatest importance to the result. Both geophysical and geochemical datasets were well represented among the highest-ranking variables, reinforcing the importance of access to both data types. Further analysis showed that, in many cases, the importance of high-ranking datasets had a plausible geological explanation, often consistent with conventional interpretation. In other cases, the method provided new insights, identifying datasets that might not have been considered at the outset of a new project.
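The idea that a highly ranked dataset gives the most decisive split between classes can be illustrated with a small sketch. It is not the workflow used in the study: it scores each variable by the largest Gini impurity decrease from a single threshold, whereas Random Forests averages such decreases over many trees grown on bootstrap samples. The variable names, sample values and class labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from one threshold on one variable."""
    n = len(labels)
    parent = gini(labels)
    pairs = sorted(zip(values, labels))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Size-weighted impurity of the two child nodes
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_variables(table, labels):
    """Return (variable, gain) pairs sorted from most to least decisive."""
    gains = {name: best_split_gain(vals, labels) for name, vals in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical outcrop samples: three of each lithology (assumed values)
labels = ["schist", "schist", "schist", "carbonate", "carbonate", "carbonate"]
table = {
    "potassium_pct": [2.9, 3.1, 3.4, 0.4, 0.6, 0.5],  # separates the classes cleanly
    "mag_tmi_nT": [120, 310, 95, 280, 110, 300],      # overlaps between classes
}

ranking = rank_variables(table, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy case the hypothetical potassium channel ranks first because one threshold separates the two classes completely, while the magnetic values overlap and no single threshold reduces impurity by much.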