Integration of auto-encoder network with density-based spatial clustering for geochemical anomaly detection for mineral exploration

Abstract An auto-encoder network can be used to reduce the dimensionality of data and to reconstruct a sample population with an unknown, complex multivariate probability distribution. Small-probability samples contribute little to the training of the network and are therefore reconstructed with high error. In this paper, trained auto-encoder networks were used to detect geochemical anomalies. In contrast to the deep auto-encoder network, density-based spatial clustering of applications with noise (DBSCAN) regards noise samples (e.g., geochemically anomalous samples) that differ from core samples (e.g., geochemically background samples) as anomalies. Therefore, the representations learned by the code layer of the auto-encoder network were clustered by DBSCAN to detect noise samples representing geochemical anomalies. As a benchmark for evaluating the performance of the auto-encoder network and DBSCAN, and in consideration of the compositional nature of geochemical data, compositional multivariate outlier detection was also applied. We applied these methods to two forms of the geochemical data: (1) without any transformation and (2) with isometric log-ratio transformation. The similarities between the anomaly maps obtained from the two data forms indicate that the auto-encoder network is effective for detecting multivariate geochemical anomalies. Differences between the anomaly maps indicate, however, that the compositional nature of geochemical data affects the performance of multivariate geochemical anomaly detection. Nevertheless, receiver operating characteristic analysis of the geochemical anomalies derived using the different methods implies that the detected anomalies are related to Au mineralization.
Finally, the Youden index, which measures the relationship between binary anomalies and known deposits, was used to select an optimal threshold for converting the continuous geochemical anomaly data into an optimal mineral potential map. The spatial distribution of geochemical anomalies at or around faults and magmatic rocks provides insight into where further detailed exploration is warranted in the study area.
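
The threshold selection described above can be illustrated with a minimal sketch. It assumes the common definition of the Youden index, J = sensitivity + specificity - 1 (equivalently, true positive rate minus false positive rate), and picks the anomaly score that maximizes J. The function name `youden_threshold`, the toy scores, and the deposit labels are hypothetical and are not taken from the paper; the authors' actual implementation may differ.

```python
import numpy as np


def youden_threshold(scores, is_deposit):
    """Return (threshold, J) that maximizes Youden's J = TPR - FPR.

    scores     : continuous anomaly score per map cell (higher = more anomalous)
    is_deposit : boolean per map cell, True where a known deposit occurs
    """
    scores = np.asarray(scores, dtype=float)
    is_deposit = np.asarray(is_deposit, dtype=bool)
    n_pos = is_deposit.sum()
    n_neg = (~is_deposit).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one deposit and one non-deposit cell")

    best_t, best_j = None, -np.inf
    # Try every observed score as a candidate threshold.
    for t in np.unique(scores):
        anomalous = scores >= t                          # binary anomaly map at t
        tpr = (anomalous & is_deposit).sum() / n_pos     # sensitivity
        fpr = (anomalous & ~is_deposit).sum() / n_neg    # 1 - specificity
        j = tpr - fpr
        if j > best_j:                                   # first maximum wins on ties
            best_t, best_j = float(t), float(j)
    return best_t, best_j


# Hypothetical toy data: eight cells, three of which contain known deposits.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
deposits = [False, False, False, True, False, False, True, True]

threshold, j = youden_threshold(scores, deposits)
binary_map = np.asarray(scores) >= threshold             # thresholded anomaly map
```

In this sketch, cells with a score at or above the returned threshold would be mapped as anomalous. With tied maxima the lowest such threshold is kept, which is an arbitrary choice made here for simplicity.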
