Mining statistically sound co-location patterns at multiple distances

Existing co-location mining algorithms require a user-provided distance threshold at which prevalent patterns are searched. Because real spatial interactions may occur at different distances, it is difficult to find a distance threshold that captures all true patterns, and a single appropriate threshold may not even exist. A standard co-location mining algorithm also requires a prevalence measure threshold to find prevalent patterns. The prevalence measure values of true co-location patterns occurring at different distances may vary, so finding a prevalence measure threshold that yields all true patterns without also reporting random patterns is difficult and sometimes impossible. In this paper, we propose an algorithm to mine true co-location patterns at multiple distances. Our approach is based on a statistical test and requires no thresholds for the prevalence measure or the interaction distance. We evaluate the efficacy of our algorithm on synthetic and real data sets, comparing it with the state-of-the-art co-location mining approach.
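To make the idea of replacing a prevalence threshold with a statistical test concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a two-feature pattern {A, B}, the participation index as the prevalence measure, points in the unit square, and a simple null model in which the B instances are redistributed uniformly at random while the A instances stay fixed. The function names, the null model, and the candidate distances are all illustrative assumptions, and no multiple-testing correction across distances is applied.

```python
import math
import random


def participation_index(a_pts, b_pts, d):
    """Minimum, over both features, of the fraction of instances that
    have at least one instance of the other feature within distance d."""
    def ratio(src, dst):
        hits = sum(1 for p in src if any(math.dist(p, q) <= d for q in dst))
        return hits / len(src)
    return min(ratio(a_pts, b_pts), ratio(b_pts, a_pts))


def monte_carlo_p_values(a_pts, b_pts, distances, n_sim=99, seed=0):
    """For each candidate distance, return (observed index, p-value).

    Assumed null model (illustrative only): B is placed uniformly at
    random in the unit square, independently of A.
    """
    rng = random.Random(seed)
    observed = {d: participation_index(a_pts, b_pts, d) for d in distances}
    exceed = {d: 0 for d in distances}
    for _ in range(n_sim):
        sim_b = [(rng.random(), rng.random()) for _ in b_pts]
        for d in distances:
            # Count simulations at least as extreme as the observation.
            if participation_index(a_pts, sim_b, d) >= observed[d]:
                exceed[d] += 1
    return {d: (observed[d], (exceed[d] + 1) / (n_sim + 1)) for d in distances}


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical data: each B instance lies close to one A instance.
    a = [(rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)) for _ in range(30)]
    b = [(x + rng.uniform(-0.02, 0.02), y + rng.uniform(-0.02, 0.02))
         for x, y in a]
    for d, (pi, p) in monte_carlo_p_values(a, b, [0.05, 0.1, 0.2]).items():
        print(f"d={d}: participation index={pi:.2f}, p-value={p:.3f}")
```

In this sketch the user supplies a list of candidate distances and a significance level rather than a single distance threshold and a prevalence threshold. A real procedure would also need to account for testing many patterns and distances at once.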
