Real-Time Detection of Traffic From Twitter Stream Analysis

Social networks have recently been employed as a source of information for event detection, with particular reference to road traffic congestion and car accidents. In this paper, we present a real-time monitoring system for traffic event detection from Twitter stream analysis. The system fetches tweets from Twitter according to several search criteria, processes them by applying text mining techniques, and finally classifies them. The aim is to assign each tweet the appropriate class label, that is, whether or not it relates to a traffic event. The traffic detection system was employed for real-time monitoring of several areas of the Italian road network, allowing traffic events to be detected almost in real time, often before online traffic news web sites report them. We employed the support vector machine as the classification model and achieved an accuracy of 95.75% on a binary classification problem (traffic versus nontraffic tweets). We were also able to discriminate whether or not traffic is caused by an external event by solving a multiclass classification problem, obtaining an accuracy of 88.89%.
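As a rough illustration of the text mining stage, the sketch below turns raw tweets into weighted term vectors of the kind a support vector machine could take as input. The abstract does not specify the individual steps, so the choices here (URL and mention stripping, a tiny stop-word list, TF-IDF weighting) and all names (`STOP_WORDS`, `tokenize`, `tfidf_vectors`, the sample tweets) are assumptions made for illustration, not the system's actual implementation.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "the", "on", "in", "is", "at", "of", "and", "to"}


def tokenize(tweet):
    """Lowercase, strip URLs and @mentions, split into tokens, drop stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [t for t in re.findall(r"[a-z0-9]+", text) if t not in STOP_WORDS]


def tfidf_vectors(tweets):
    """Return one sparse {term: weight} dict per tweet using tf * log(N / df)."""
    docs = [tokenize(t) for t in tweets]
    n_docs = len(docs)
    # Document frequency: number of tweets that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: freq * idf[term] for term, freq in tf.items()})
    return vectors


# Made-up sample tweets, for illustration only.
tweets = [
    "Traffic jam on the A1 near Florence http://example.com",
    "Accident on A1, long queue @someone",
    "Lovely dinner in Florence",
]
vectors = tfidf_vectors(tweets)
```

In this sketch, terms that appear in fewer tweets receive a larger weight than terms shared across tweets. A classifier would then be trained on such vectors together with traffic or nontraffic labels.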
