City Forensics: Using Visual Elements to Predict Non-Visual City Attributes

We present a method for automatically identifying and validating predictive relationships between the visual appearance of a city and its non-visual attributes (e.g., crime statistics, housing prices, and population density). Given a set of street-level images and (location, city-attribute-value) measurement pairs, we first identify visual elements in the images that are discriminative of the attribute. We then train a predictor by learning a set of weights over these elements using non-linear Support Vector Regression. To perform these operations efficiently, we implement a scalable distributed processing framework that speeds up the main computational bottleneck (extracting visual elements) by an order of magnitude. This speedup allows us to investigate a variety of city attributes across six American cities. We find that a predictive relationship does exist between visual elements and a number of city attributes, including violent crime rates, theft rates, housing prices, population density, tree presence, graffiti presence, and the perception of danger. We also measure human performance at predicting theft from street-level images and show that our predictor outperforms this baseline, with 33% higher accuracy on average. Finally, we present three prototype applications that use our system to (1) define the visual boundary of city neighborhoods, (2) generate walking directions that avoid or seek out exposure to city attributes, and (3) validate user-specified visual elements for prediction.
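As a rough illustration of application (2), the sketch below routes over a small street graph whose edge costs are scaled by a predicted attribute value at the destination node. The abstract does not say how routes are computed, so the use of Dijkstra's algorithm, the cost formula, the `alpha` parameter, and the names `attribute_aware_route`, `GRAPH`, and `RISK` are all assumptions made for this example, not the authors' implementation.

```python
import heapq

# Hypothetical toy street graph: node -> list of (neighbor, length in meters).
GRAPH = {
    "A": [("B", 100.0), ("C", 120.0)],
    "B": [("D", 100.0)],
    "C": [("D", 120.0)],
    "D": [],
}
# Hypothetical predicted attribute value per node (e.g. theft likelihood in [0, 1]).
RISK = {"A": 0.0, "B": 0.9, "C": 0.0, "D": 0.0}


def attribute_aware_route(graph, risk, start, goal, alpha=1.0):
    """Dijkstra search with attribute-scaled edge costs (illustrative only).

    Each edge cost is length * max(0, 1 + alpha * risk[neighbor]).
    alpha > 0 penalizes exposure, alpha < 0 rewards it, alpha = 0 is plain
    shortest path. Clamping at 0 keeps costs non-negative, which Dijkstra
    requires. Returns the node list from start to goal, or None if unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == goal:
            break
        for v, length in graph.get(u, []):
            cost = length * max(0.0, 1.0 + alpha * risk.get(v, 0.0))
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    # Walk predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path
```

With `alpha=0.0` the search returns the geometrically shortest route through `B`; raising `alpha` to `2.0` makes the high-attribute node `B` expensive enough that the longer route through `C` is chosen instead.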
