Discovering and Quantifying Mean Streets: A Summary of Results

Mean streets are connected subsets of a spatial network whose attribute values are significantly higher than expected. Discovering and quantifying mean streets is an important problem with many applications, such as detecting high-crime-density streets and high-crash roads (or areas) for public safety, detecting urban cancer clusters for public health, detecting human activity patterns in asymmetric warfare scenarios, and detecting urban activity centers for consumer applications. However, discovering and quantifying mean streets in large spatial networks is computationally very expensive, because it is difficult to characterize and enumerate the population of streets needed to define a norm, or expected activity level. Previous work either pursues statistical rigor at exorbitant computational cost, or concentrates on computational efficiency without offering any statistical interpretation of the algorithms. In contrast, this paper explores computationally efficient algorithms that produce statistically interpretable results. We describe alternative ways of defining and efficiently enumerating instances of subgraph families such as paths. We also use statistical models such as the Poisson distribution and the sum of independent Poisson distributions to provide interpretations for results. We define the problem of discovering and quantifying mean streets and propose a novel mean streets mining algorithm. Experimental evaluations using synthetic and real-world datasets show that the proposed method is computationally more efficient than naïve alternatives.
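
To make the two ingredients named above concrete (enumerating paths and interpreting their activity with a Poisson model), the following is a minimal sketch of a naïve baseline, not the mining algorithm proposed in the paper. It assumes an undirected network given as an adjacency dictionary, an observed activity count and an expected Poisson rate for every edge, and independence between edges, so that the total count along a path is Poisson with the summed rate. The names `simple_paths`, `poisson_upper_tail`, `mean_streets`, `max_edges`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)          # P(X = 0)
    cdf = term
    for i in range(1, k):          # accumulate P(X <= k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def simple_paths(adj, max_edges):
    """Yield every simple path with 1..max_edges edges, once per direction pair."""
    def extend(path):
        # Report each undirected path only in the orientation start < end.
        if len(path) > 1 and path[0] < path[-1]:
            yield tuple(path)
        if len(path) - 1 == max_edges:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:    # keep the path simple (no repeated node)
                yield from extend(path + [nxt])

    for start in adj:
        yield from extend([start])


def mean_streets(adj, counts, rates, max_edges=3, alpha=0.01):
    """Flag paths whose total count is improbably high under the Poisson model.

    counts and rates are keyed by (u, v) with u < v. This is an assumed
    input layout, not one taken from the paper.
    """
    flagged = []
    for path in simple_paths(adj, max_edges):
        edges = [tuple(sorted(e)) for e in zip(path, path[1:])]
        observed = sum(counts[e] for e in edges)
        expected = sum(rates[e] for e in edges)   # sum of independent Poissons
        p_value = poisson_upper_tail(observed, expected)
        if p_value < alpha:
            flagged.append((path, observed, expected, p_value))
    return flagged


if __name__ == "__main__":
    adj = {0: [1], 1: [0, 2], 2: [1]}
    counts = {(0, 1): 9, (1, 2): 1}
    rates = {(0, 1): 2.0, (1, 2): 2.0}
    for path, obs, exp, p in mean_streets(adj, counts, rates):
        print(path, obs, exp, round(p, 5))
```

This brute-force enumeration grows exponentially with `max_edges`, which illustrates why enumeration is the computational bottleneck. The sketch also applies no correction for testing many overlapping paths, so the threshold `alpha` should be read as illustrative only.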
