Efficient Reachability Query Evaluation in Large Spatiotemporal Contact Datasets

With the advent of reliable positioning technologies and the prevalence of location-based services, it is now feasible to accurately study the propagation of items such as infectious viruses, sensitive pieces of information, and malware through a population of moving objects, e.g., individuals, mobile devices, and vehicles. In such application scenarios, an item passes between two objects when they are sufficiently close (i.e., when they are in contact). Hence, once an item is introduced, it can spread through the object population via the evolving network of contacts among objects, termed the contact network. In this paper, we define and study, for the first time, reachability queries in large (i.e., disk-resident) contact datasets, which record the movement of a (potentially large) set of objects in a spatial environment over an extended time period. A reachability query verifies whether two objects are "reachable" through the evolving contact network represented by such a contact dataset. We propose two contact-dataset indexes that enable efficient evaluation of such queries despite the potentially very large size of the contact datasets. With the first index, termed ReachGrid, only the small portion of the contact network that is required for reachability evaluation is constructed and traversed at query time. With the second index, termed ReachGraph, we precompute reachability at different scales and leverage these precomputations at query time for efficient query processing. We optimize the placement of both indexes on disk to enable efficient index traversal during query processing. We study the pros and cons of the proposed approaches through extensive experiments with both real and synthetic data. In our experimental results, the proposed approaches outperform existing reachability query processing techniques in contact networks by 76% on average.
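To make the notion of reachability through an evolving contact network concrete, the following is a minimal in-memory sketch of a naive baseline. It is not ReachGrid or ReachGraph, and it ignores disk residency entirely. The function name `is_reachable`, the `(time, a, b)` contact tuples, and the query interval `[t_start, t_end]` are hypothetical choices made for illustration. The sketch also assumes that an item is transferred instantaneously during a contact and may be relayed onward through other contacts that occur at the same time instant.

```python
from itertools import groupby
from operator import itemgetter


def is_reachable(contacts, source, target, t_start, t_end):
    """Return True if an item held by `source` at t_start can reach
    `target` by t_end through time-respecting contacts.

    contacts: iterable of (time, a, b) tuples; each contact is symmetric.
    """
    if source == target:
        return True
    reached = {source}
    # Keep only contacts inside the query interval, in chronological order.
    window = sorted(
        (c for c in contacts if t_start <= c[0] <= t_end),
        key=itemgetter(0),
    )
    for _, group in groupby(window, key=itemgetter(0)):
        edges = list(group)
        # Assumption: contacts sharing a timestamp can chain, so repeat
        # until no new object is reached at this time instant.
        changed = True
        while changed:
            changed = False
            for _, a, b in edges:
                if (a in reached) != (b in reached):
                    reached.update((a, b))
                    changed = True
        if target in reached:
            return True
    return False


contacts = [(1, "A", "B"), (2, "B", "C"), (3, "C", "D"), (1, "D", "E")]
print(is_reachable(contacts, "A", "D", 0, 3))
print(is_reachable(contacts, "A", "E", 0, 3))
```

In this toy dataset, D is reachable from A because the contacts A-B, B-C, and C-D occur in chronological order, whereas E is not, since the only D-E contact happens before D receives the item. The scan reads every contact in the interval, which is the cost that indexes over disk-resident data aim to avoid.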
