Discovering Spatially Contiguous Clusters in Multivariate Geostatistical Data Through Spectral Clustering

Spectral clustering has become one of the most popular modern clustering algorithms for traditional data. However, its application to geostatistical data produces spatially scattered clusters, which is undesirable in many geoscience applications. In this work, we develop a spectral clustering method aimed at discovering spatially contiguous and meaningful clusters in multivariate geostatistical data, in which spatial dependence plays an important role. The proposed method relies on a similarity measure built from a non-parametric kernel estimator of the multivariate spatial dependence structure of the data, emphasizing the spatial correlation among data locations. Its capability to provide spatially contiguous and meaningful clusters is illustrated using the European Geological Surveys Geochemical database.
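To make the general idea concrete, the following is a minimal sketch of spectral clustering with a similarity that accounts for spatial proximity. It is not the similarity measure proposed in this work: the abstract describes a non-parametric kernel estimator of the multivariate spatial dependence structure, whereas the sketch substitutes a simple product of two Gaussian kernels, one on coordinates and one on attribute values. The function name `spatial_spectral_bipartition`, the bandwidth parameters `spatial_bw` and `attr_bw`, and the synthetic data are all assumptions made for illustration, and the example is restricted to two clusters so that the sign of the second eigenvector can be used in place of a k-means step.

```python
import numpy as np


def spatial_spectral_bipartition(coords, values, spatial_bw, attr_bw):
    """Split locations into two clusters with normalized spectral clustering.

    ASSUMPTION: the similarity is a product of Gaussian kernels on
    coordinates and attributes, a stand-in for the paper's estimator.
    """
    # Pairwise squared distances in geographic space and in attribute space.
    d2_space = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    d2_attr = ((values[:, None, :] - values[None, :, :]) ** 2).sum(axis=-1)

    # Two locations are similar only if they are close in space AND
    # have similar attribute values.
    W = np.exp(-d2_space / (2 * spatial_bw**2)) * np.exp(-d2_attr / (2 * attr_bw**2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L_sym = I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(L_sym)

    # Rescale by D^{-1/2} to get the generalized eigenvector of L v = lambda D v,
    # then partition the locations by its sign.
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Synthetic example (hypothetical data): a 10 x 10 grid whose left and right
# halves have different mean values for two variables.
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
means = np.where(coords[:, [0]] < 5, 0.0, 3.0)
values = means + rng.normal(scale=0.3, size=(len(coords), 2))

labels = spatial_spectral_bipartition(coords, values, spatial_bw=1.5, attr_bw=1.0)
print(labels.reshape(10, 10))
```

Because the spatial kernel drives the similarity of distant locations toward zero, clusters found this way tend to be spatially connected. The bandwidths control how strongly contiguity is favoured over attribute similarity and would need tuning on real data.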
