Mining Co-locations from Continuously Distributed Uncertain Spatial Data

A co-location pattern is a group of spatial features whose instances tend to be located together in geographic space. Traditional co-location mining discovers such patterns from deterministic spatial data sets; in this paper, we study the problem in the context of continuously distributed uncertain data. In particular, we aim to discover co-location patterns from uncertain spatial data in which the locations of spatial instances are represented as multivariate Gaussian distributions. We first formulate the problem of probabilistic co-location mining based on newly defined prevalence measures. When the locations of instances are represented as continuous variables, the major challenges of probabilistic co-location mining lie in the efficient computation of prevalence measures and the verification of the probabilistic neighborhood relationship between instances. We develop an effective probabilistic co-location mining framework integrated with optimization strategies to address these challenges. Our experiments on multiple data sets demonstrate the effectiveness of the proposed algorithm.
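
To make the probabilistic neighborhood relationship concrete, the following is a minimal sketch and not the paper's algorithm. It assumes two-dimensional locations, independence between the two instances, and a neighbor relation defined by a Euclidean distance threshold; the abstract states none of these. The function names `cholesky_2x2` and `neighbor_probability` are hypothetical. Under these assumptions the difference of the two locations is itself Gaussian, with mean equal to the difference of the means and covariance equal to the sum of the covariances, so the probability that the instances are neighbors can be estimated by plain Monte Carlo sampling.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular Cholesky factor of a 2x2 positive-definite matrix."""
    (a, b), (_, c) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    return l11, l21, l22


def neighbor_probability(mean_a, cov_a, mean_b, cov_b, dist,
                         samples=20000, seed=0):
    """Estimate P(||A - B|| <= dist) for independent 2-D Gaussian locations.

    Assumption (not from the paper): A and B are independent, so
    A - B ~ N(mean_a - mean_b, cov_a + cov_b).
    """
    rng = random.Random(seed)
    dx = mean_a[0] - mean_b[0]
    dy = mean_a[1] - mean_b[1]
    # Covariance of the difference is the element-wise sum.
    cov = [[cov_a[i][j] + cov_b[i][j] for j in range(2)] for i in range(2)]
    l11, l21, l22 = cholesky_2x2(cov)
    limit = dist * dist
    hits = 0
    for _ in range(samples):
        # Transform two standard normal draws into a correlated sample.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = dx + l11 * z1
        y = dy + l21 * z1 + l22 * z2
        if x * x + y * y <= limit:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    iso = [[0.5, 0.0], [0.0, 0.5]]
    print(neighbor_probability((0.0, 0.0), iso, (0.0, 0.0), iso, 1.0))
```

Sampling every candidate pair in this way is costly, which illustrates why efficient verification of the neighborhood relationship is named as one of the major challenges.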
