A comparison of t-SNE, SOM and SPADE for identifying material type domains in geological data

Abstract: Standard mine modelling practice often involves investing significant effort in interpreting the deposit and identifying geological domains from chemical assays and geophysics. These domains and their accuracy play a key role in grade estimation using spatial modelling approaches such as Gaussian processes and kriging. However, the domains developed for grade estimation do not always produce regions with well-defined correlations between the material types present in the deposit. The material type is based on ore characteristics such as mineralogy, texture and other visible petrological properties. Therefore, a new and potentially more flexible methodology for domaining material types is needed. This study applies t-SNE, SOM and SPADE to material type data to embed the high-dimensional data in a low-dimensional space and thus facilitate clustering. The methods were tested on a banded iron formation-hosted iron ore deposit in the Hamersley region of Western Australia. All three methods produced clusters that were purer mixtures of the material types than the original domains. Owing to the geologist's input, SPADE produced the clusters closest to the original domains, although this may not be the best clustering for material type modelling. SPADE also produced the best results for the ore, which highlights how user input can focus the method on the region of greatest interest. t-SNE and SOM are more automatic, but their results were further from the original clusters. t-SNE identified clusters that were better spatially grouped than those of SOM, which was generally the method most affected by the high variation in material within the detritals.
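
The abstract compares how pure the resulting clusters are in terms of material type, but does not state which measure was used. As a minimal sketch, assuming the common definition of purity (the fraction of samples that belong to the majority material type of their cluster), the comparison could look like the following. The function name, the material names and the toy labels are all hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_labels, material_types):
    """Fraction of samples that match the majority material type of their cluster.

    Assumed definition of purity; the study's own measure may differ.
    """
    if len(cluster_labels) != len(material_types):
        raise ValueError("label sequences must have the same length")
    if not cluster_labels:
        raise ValueError("label sequences must not be empty")

    # Count the material types that occur inside each cluster.
    counts_per_cluster = defaultdict(Counter)
    for cluster, material in zip(cluster_labels, material_types):
        counts_per_cluster[cluster][material] += 1

    # Sum the size of the majority material type over all clusters.
    majority_total = sum(max(c.values()) for c in counts_per_cluster.values())
    return majority_total / len(cluster_labels)


# Hypothetical toy data: six samples with a logged material type each.
material = ["hematite", "hematite", "goethite", "goethite", "shale", "shale"]
original_domains = [0, 0, 0, 1, 1, 1]  # hypothetical grade-estimation domains
new_clusters = [0, 0, 1, 1, 2, 2]      # hypothetical clusters from an embedding

purity_domains = cluster_purity(original_domains, material)
purity_clusters = cluster_purity(new_clusters, material)
```

A higher value means each cluster is dominated by a single material type. Purity rises trivially as the number of clusters grows, so it is only a fair comparison between groupings of similar size.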
