SSCP: Mining Statistically Significant Co-location Patterns

Co-location pattern discovery searches for subsets of spatial features whose instances are often located in close spatial proximity. Current algorithms, which rely on user-specified thresholds for prevalence measures, may report co-locations even when the features are randomly distributed. In our model, we look for subsets of spatial features that are co-located because of some form of spatial dependency rather than by chance. We first introduce a new definition of co-location patterns based on a statistical test. We then propose an algorithm for finding such co-location patterns, adopting two strategies to reduce the computational cost relative to a naive approach based on simulations of the data distribution. First, we propose a pruning strategy for computing the prevalence measures. Second, we show that instead of generating all instances of an auto-correlated feature during a simulation, we can generate a reduced number of instances for the prevalence measure computation. We evaluate our algorithm empirically on synthetic and real data and compare our findings with the results of a state-of-the-art co-location mining algorithm.
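
The naive simulation-based test mentioned above can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes the participation index as the prevalence measure, a pair of features A and B in the unit square, and a simple null model in which B is redistributed uniformly at random while A stays fixed (the paper's null model also accounts for auto-correlated features, which this sketch ignores). The function names `participation_index` and `monte_carlo_p_value` and all parameters are hypothetical.

```python
import math
import random


def participation_index(a_pts, b_pts, radius):
    """Participation index of the pair {A, B}: the smaller of the two
    participation ratios (fraction of instances with a neighbour of the
    other feature within `radius`)."""
    def ratio(src, dst):
        if not src:
            return 0.0
        hits = sum(
            1 for (x, y) in src
            if any(math.hypot(x - u, y - v) <= radius for (u, v) in dst)
        )
        return hits / len(src)

    return min(ratio(a_pts, b_pts), ratio(b_pts, a_pts))


def monte_carlo_p_value(a_pts, b_pts, radius, n_sims=99, seed=0):
    """Estimate how often a simulated participation index is at least as
    large as the observed one.

    Assumed null model (a simplification): A is kept fixed and B is
    replaced by the same number of points drawn uniformly from the
    unit square.
    """
    rng = random.Random(seed)
    observed = participation_index(a_pts, b_pts, radius)
    at_least = 0
    for _ in range(n_sims):
        sim_b = [(rng.random(), rng.random()) for _ in b_pts]
        if participation_index(a_pts, sim_b, radius) >= observed:
            at_least += 1
    # The observed data set counts as one realisation of the null model.
    return observed, (at_least + 1) / (n_sims + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    a = [(rng.random(), rng.random()) for _ in range(30)]
    # B is placed right next to A, so the pair is spatially dependent.
    b = [(x + 0.005, y + 0.005) for (x, y) in a]
    pi, p = monte_carlo_p_value(a, b, radius=0.02)
    print(f"participation index = {pi:.2f}, estimated p-value = {p:.2f}")
```

A pattern would be reported as significant when the estimated p-value falls below a chosen level such as 0.05. Every simulation recomputes the prevalence measure from scratch here, which is the cost that the pruning strategy and the reduced instance generation described in the abstract aim to cut.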
