Crime in Philadelphia: Bayesian Clustering with Particle Optimization

Accurate estimation of the change in crime over time is a critical first step towards better understanding of public safety in large urban environments. Bayesian hierarchical modeling is a natural way to study spatial variation in urban crime dynamics at the neighborhood level, since it facilitates principled "sharing of information" between spatially adjacent neighborhoods. Typically, however, cities contain many physical and social boundaries that may manifest as spatial discontinuities in crime patterns. In this situation, standard prior choices often yield overly smooth parameter estimates, which can ultimately produce miscalibrated forecasts. To prevent potential over-smoothing, we introduce a prior that first partitions the set of neighborhoods into several clusters and then encourages spatial smoothness within each cluster. In terms of model implementation, conventional stochastic search techniques are computationally prohibitive, as they must traverse a combinatorially vast space of partitions. We introduce an ensemble optimization procedure that simultaneously identifies several high-probability partitions by solving one optimization problem using a new local search strategy. We then use the identified partitions to estimate crime trends in Philadelphia between 2006 and 2017. On simulated and real data, our proposed method demonstrates good estimation and partition selection performance. Supplementary materials for this article are available online.
