StreetNet: preference learning with convolutional neural network on urban crime perception

The broken windows theory suggests that how safe a city street is perceived to be depends significantly on its visual appearance. Previous work has examined the feasibility of using computer vision algorithms to classify urban scenes. Most existing urban perception models predict binary outcomes, such as safe or dangerous, wealthy or poor. However, binary predictions are coarse and cannot support more informative inferences, such as the types of crime likely to occur in a given area. In this paper, we explore the connection between urban perception and crime inference. We propose StreetNet, a convolutional neural network (CNN) that learns crime rankings from street view images. The learning process is formulated as a preference learning and label ranking problem. We design a street view image retrieval algorithm to improve the representation of urban perception, and we propose a data-driven, spatiotemporal algorithm to find unbiased label mappings between street view images and crime ranking records. Extensive evaluations on images from different cities, together with comparisons against baselines, demonstrate the effectiveness of the proposed method.
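To make the label ranking formulation concrete, the following is a minimal sketch of a pairwise hinge loss over a ranked list of crime types. It is an illustration under stated assumptions, not the loss used by StreetNet: the function name `pairwise_ranking_loss`, the margin value, and the crime type labels are all hypothetical, and the per-label scores stand in for the outputs a CNN might produce for one street view image.

```python
from itertools import combinations


def pairwise_ranking_loss(scores, ranking, margin=1.0):
    """Mean hinge loss over all label pairs implied by a ranking.

    scores  : dict mapping label -> predicted score (higher means ranked earlier)
    ranking : list of labels ordered from most to least preferred,
              e.g. crime types ordered from most to least frequent
    margin  : required score gap between a preferred label and a less
              preferred one (an assumed hyperparameter)
    """
    # combinations() preserves list order, so each pair is
    # (preferred label, less preferred label).
    pairs = list(combinations(ranking, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for better, worse in pairs:
        # Penalize pairs whose score gap is smaller than the margin.
        total += max(0.0, margin - (scores[better] - scores[worse]))
    return total / len(pairs)


# Hypothetical ground-truth ranking of crime types for one location.
ranking = ["theft", "assault", "burglary"]

# Scores that agree with the ranking by at least the margin.
good_scores = {"theft": 3.0, "assault": 1.5, "burglary": 0.0}
# Scores that reverse the ranking.
bad_scores = {"theft": 0.0, "assault": 1.5, "burglary": 3.0}

print(pairwise_ranking_loss(good_scores, ranking))  # 0.0
print(pairwise_ranking_loss(bad_scores, ranking))   # 3.0
```

A pairwise decomposition like this is one common way to turn a ranking into a trainable objective: each ordered pair of labels becomes a preference constraint, and the loss is zero only when every preferred label outscores every less preferred one by the margin.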
