Significant Route Discovery: A Summary of Results

Given a spatial network and a collection of activities (e.g., pedestrian fatality reports, crime reports), Significant Route Discovery (SRD) finds all shortest paths in the spatial network where the concentration of activities is unusually high (i.e., statistically significant). SRD is important for societal applications in transportation safety, public safety, and public health, such as finding routes with significant concentrations of accidents, crimes, or diseases. SRD is challenging because 1) the number of candidate routes is potentially very large (~10^16) in a dataset with millions of activities or road network nodes and 2) significance testing does not obey the monotonicity property. Previous work focused on finding circular areas of concentration, which limits its usefulness for finding significant linear routes on a network. For example, SaTScan may miss many significant routes because a large fraction of the area bounded by the circles that cover activities on a path will be empty. This paper proposes a novel algorithm for discovering statistically significant routes. To improve performance, the proposed algorithm features algorithmic refinements that prune unlikely paths and speed up Monte Carlo simulation. We present a case study comparing the proposed statistically significant network-based analysis (i.e., shortest paths) to a statistically significant geometry-based analysis (e.g., circles) on pedestrian fatality data. Experimental results on real data show that the proposed algorithm, with our algorithmic refinements, yields substantial computational savings without reducing result quality.
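To make the role of Monte Carlo simulation concrete, the following is a minimal sketch of a significance test for a single, fixed candidate path. It is not the algorithm proposed in the paper. It assumes a simple test statistic (the number of activities on the path) and a null hypothesis in which each activity falls on an edge with probability proportional to the edge's length. The function name monte_carlo_p_value, its parameters, and the toy network are hypothetical and chosen only for illustration.

```python
import random


def monte_carlo_p_value(edge_lengths, activity_counts, path_edges,
                        trials=999, seed=0):
    """Estimate a p-value for the activity count on one candidate path.

    Assumed null hypothesis: every activity lands on an edge with
    probability proportional to that edge's length.
    """
    rng = random.Random(seed)
    edges = list(edge_lengths)
    weights = [edge_lengths[e] for e in edges]
    total = sum(activity_counts.values())
    on_path = set(path_edges)

    # Observed statistic: number of activities on the candidate path.
    observed = sum(activity_counts.get(e, 0) for e in on_path)

    at_least_as_extreme = 0
    for _ in range(trials):
        # Redistribute all activities over the network under the null.
        placed = rng.choices(edges, weights=weights, k=total)
        simulated = sum(1 for e in placed if e in on_path)
        if simulated >= observed:
            at_least_as_extreme += 1

    # Standard Monte Carlo estimate; the observed data counts as one draw.
    return (at_least_as_extreme + 1) / (trials + 1)


# Toy network: four edges of equal length, all activities on edge "ab".
edge_lengths = {"ab": 1.0, "bc": 1.0, "cd": 1.0, "de": 1.0}
activity_counts = {"ab": 10, "bc": 0, "cd": 0, "de": 0}

p_hot = monte_carlo_p_value(edge_lengths, activity_counts, ["ab"])
p_all = monte_carlo_p_value(edge_lengths, activity_counts, list(edge_lengths))
print(p_hot, p_all)
```

In this sketch a path covering the whole network can never be significant, because every simulated count equals the observed one. Testing a single path in isolation also ignores the multiple-testing issue that arises when many candidate routes are examined, which a full method would need to address.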
