Profiling and Prediction of Non-Emergency Calls in New York City

Non-emergency calls, namely 311 calls, capture the complaints of city residents and visitors about a variety of problems they experience in a city. The 311 calls in New York City (NYC) are publicly available and can provide an interesting view of the status of the city. In this paper, we share a summary of an extensive analysis that we are performing on the 311 data of NYC, as well as a data-based prediction of the number of 311 calls. We present information about the 311 data files and content along multiple dimensions, and then present prediction results, in which we show that several semantic features affect the different types of complaints differently.

Introduction

Several cities, New York City in particular for this paper, have a 24-hour 311 hotline and online service, which allows anyone, resident or tourist, to report a non-emergency problem. Reported 311 problems are passed along to government services, which address and solve the problem. The records of 311 calls are publicly available and updated daily. Analysis of 311 calls can be of great use for a wide variety of purposes, ranging from a rich understanding of the status of a city to the effectiveness of government services in addressing such calls. Ideally, the analysis can also support the prediction of future 311 calls, which would enable the city government to assign service resources.

We have been extensively analyzing 311 calls in NYC. In this paper, we profile the data set and highlight a few interesting facts. We provide statistics along complaint types, geolocation, and temporal patterns, and show the diversity of the big 311 data along those dimensions. We then discuss the problem of predicting the number of calls, where we experiment with different sets of semantic features. We show that the prediction error for different complaint types can vary significantly if some features are not considered.
We believe that the 311 data offer a compelling source for understanding how cities work, what influences them, and the underlying relationships that link different environments and temporal information to the reported incidents.

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Semantic Cities: Beyond Open Data to Models, Standards and Reasoning: Papers from the AAAI-14 Workshop

Profiling 311 Request Data

The 311 service request data is publicly available from the NYC Open Data Portal and has been updated daily since 2010. In our study, we use data from four complete years, from Jan 1, 2010 to Dec 31, 2013. These data come in a table with 52 columns and 6,588,519 rows, where each row is the record of one request and each column is one descriptor of the record. The 52 descriptors of the 311 data can be classified into five main categories:

Time: Descriptors of important time points of requests. There are 4 of them: Created Date, Closed Date, Due Date, and Resolution Action Updated Date.

Location: Descriptors related to the geolocation of the requests. There are 31 descriptors in this category. Many of them provide redundant information about the locations of requests, because they are designed for only certain types of requests. Examples are Incident Zip, Incident Address, X Coordinate (State Plane), and Y Coordinate (State Plane).

Type: Semantic descriptors of the requests. There are 2 members in this category: Complaint Type and Descriptor. Complaint Type categorizes requests according to their content, while Descriptor gives more detailed subcategories within each Complaint Type.

Agency: Descriptors indicating which agency handled the request. There are 2 of them: Agency and Agency Name. Agency is the abbreviation of Agency Name.

Other: 13 other varied descriptors, including Unique Key, Status, Facility Type, and Garage Lot Name. As with the Location category, many of these descriptors are designed to support only a few types of requests.
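As an illustration of how the descriptors described above could be used for profiling, the following is a minimal sketch and not the analysis code used in this study. It counts requests by complaint type, by calendar day, and by zip code, using only the Created Date, Complaint Type, and Incident Zip descriptors named in the text. The sample rows, the function name profile_requests, and the timestamp layout in DATE_FORMAT are assumptions made for illustration; the export format of the real data should be checked before reuse.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Descriptor counts per category, as reported in the text.
CATEGORY_SIZES = {"Time": 4, "Location": 31, "Type": 2, "Agency": 2, "Other": 13}

# Hypothetical sample rows, made up for illustration (not real 311 records).
SAMPLE_CSV = """Unique Key,Created Date,Complaint Type,Incident Zip
1,01/01/2010 12:05:00 AM,Noise,10001
2,01/01/2010 09:30:00 AM,Heating,10027
3,01/02/2010 11:45:00 PM,Noise,10001
4,01/02/2010 01:15:00 PM,Street Condition,11201
"""

# Assumed timestamp layout of the Created Date column.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def profile_requests(csv_file):
    """Count requests per complaint type, per calendar day, and per zip code."""
    by_type, by_day, by_zip = Counter(), Counter(), Counter()
    for row in csv.DictReader(csv_file):
        created = datetime.strptime(row["Created Date"], DATE_FORMAT)
        by_type[row["Complaint Type"]] += 1
        by_day[created.date().isoformat()] += 1
        by_zip[row["Incident Zip"]] += 1
    return by_type, by_day, by_zip


by_type, by_day, by_zip = profile_requests(io.StringIO(SAMPLE_CSV))

# The five category sizes should add up to the 52 columns of the table.
total_descriptors = sum(CATEGORY_SIZES.values())
```

To run the same counts on the full table, an open file handle for the downloaded CSV could be passed in place of the in-memory sample. Streaming rows through csv.DictReader avoids loading all 6,588,519 records into memory at once.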