Identification of (near) Real-time Traffic Congestion in the Cities of Australia through Twitter

Traffic congestion is a growing problem, especially in larger cities. In Australia, traffic conditions are typically monitored by state and/or federal authorities using expensive roadside electronic devices and sensors, or through CCTV cameras. However, there is an alternative and far cheaper way to monitor real-time traffic status on roads: targeted social media analytics. Social networking sites such as Twitter are hugely popular, public and often real-time in nature. A growing number of people tweet about their lives and feelings every day, wherever they are, often with location-based service information included. In this paper, we present an architecture and a novel harvesting and analytics approach that exploits this information to identify transport congestion in near real time. Specifically, we present an algorithm for targeted harvesting of tweets posted solely on the road network, using the definitive road network data for Australia. We then apply spatio-temporal clustering algorithms to find clusters of tweets on roads that indicate potential traffic congestion. We show the scalability of the solution through the use of the large-scale Cloud facilities offered through the National eResearch Collaboration Tools and Resources (NeCTAR -- www.nectar.org.au) Research Cloud.
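
The two steps named in the abstract (keeping only tweets that fall on the road network, then clustering them in space and time) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not name its clustering algorithm, so the density-based, DBSCAN-style clustering with an added time threshold is an assumption, as are the function names (`near_road`, `st_dbscan`), the thresholds, and the sample coordinates. The road network is approximated here by a handful of sample points rather than the definitive road network data.

```python
import math
from collections import deque


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def near_road(tweet, road_points, max_m):
    """True if the tweet lies within max_m metres of any sampled road point.

    Hypothetical stand-in for a proper point-to-road-segment test.
    """
    lat, lon, _ = tweet
    return any(haversine_m(lat, lon, rlat, rlon) <= max_m for rlat, rlon in road_points)


def st_dbscan(points, eps_m, eps_s, min_pts):
    """DBSCAN-style clustering where neighbours must be close in space AND time.

    points: list of (lat, lon, t_seconds). Returns one label per point; -1 is noise.
    """
    n = len(points)

    def neighbours(i):
        lat, lon, t = points[i]
        return [
            j for j in range(n)
            if abs(points[j][2] - t) <= eps_s
            and haversine_m(lat, lon, points[j][0], points[j][1]) <= eps_m
        ]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nb)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbours(j)
            if len(nbj) >= min_pts:
                queue.extend(nbj)  # j is a core point, so keep expanding
    return labels


# Illustrative data only: two sampled road points and six geotagged tweets.
road = [(-37.8100, 144.9600), (-37.8100, 144.9610)]
tweets = [
    (-37.81000, 144.9601, 0),
    (-37.81005, 144.9602, 60),
    (-37.81000, 144.9603, 120),
    (-37.81002, 144.9604, 180),
    (-37.81000, 144.9605, 4000),  # on the road, but much later
    (-37.90000, 145.1000, 30),    # far from the road
]
on_road = [t for t in tweets if near_road(t, road, max_m=50.0)]
labels = st_dbscan(on_road, eps_m=50.0, eps_s=300.0, min_pts=3)
```

In this sketch, a cluster label of 0 or higher marks a group of on-road tweets that are dense in both space and time and could be flagged as a candidate congestion event, while -1 marks isolated tweets. The pairwise neighbour search is quadratic; at the scale the abstract describes, a spatial index would be the natural replacement.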
