Public decision support for low population density areas: An imbalance-aware hyper-ensemble for spatio-temporal crime prediction

Abstract: Crime events are known to reveal spatio-temporal patterns, which can be used for predictive modeling and subsequent decision support. While the focus has hitherto been placed on areas with high population density, we address the challenging task of predicting crime hotspots in regions with low population density and highly unequally distributed crime. This results in severe sparsity (i.e., class imbalance) of the outcome variable, which impedes predictive modeling. To alleviate this, we develop machine learning models for spatio-temporal prediction that are specifically adjusted for an imbalanced distribution of the class labels, and we test them in a real-world setting with state-of-the-art predictors (i.e., socio-economic, geographical, temporal, meteorological, and crime variables in fine resolution). The proposed imbalance-aware hyper-ensemble increases the hit ratio considerably, from 18.1% to 24.6% when aiming for the top 5% of hotspots, and from 53.1% to 60.4% when aiming for the top 20% of hotspots. As a direct implication, the findings help decision-makers in law enforcement and contribute to public decision support in low population density regions.
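To make the reported metric concrete, the following is a minimal sketch of how a hit ratio for the top share of predicted hotspots could be computed. It assumes, as an illustration only, that the hit ratio is the share of all observed crimes that fall into the highest-scored spatial cells; the abstract does not spell out the exact definition, and the function name `hit_ratio` and its parameters are hypothetical.

```python
import numpy as np

def hit_ratio(scores, crime_counts, top_share):
    """Share of observed crimes captured by the top-scored cells.

    Assumed definition (not taken from the paper): rank cells by
    predicted score, flag the top `top_share` fraction, and divide
    the crimes inside flagged cells by all observed crimes.
    """
    scores = np.asarray(scores, dtype=float)
    crime_counts = np.asarray(crime_counts, dtype=float)

    # Number of cells to flag; always flag at least one cell.
    n_flagged = max(1, int(np.ceil(top_share * scores.size)))

    # Indices of the highest-scored cells (stable sort keeps ties ordered).
    top_idx = np.argsort(-scores, kind="stable")[:n_flagged]

    total = crime_counts.sum()
    if total == 0:
        return 0.0  # no crimes observed, so nothing can be hit
    return float(crime_counts[top_idx].sum() / total)

# Toy data: 10 cells, crimes concentrated in very few of them.
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.05, 0.6, 0.4, 0.7, 0.15]
counts = [3, 0, 1, 0, 0, 0, 1, 0, 0, 0]
ratio_top20 = hit_ratio(scores, counts, top_share=0.2)
ratio_top10 = hit_ratio(scores, counts, top_share=0.1)
```

Under this assumed definition, a sparse outcome variable means that most cells contain no crime at all, so the metric rewards models that rank the few affected cells near the top.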
