A methodology for traffic-related Twitter messages interpretation

Highlights
This paper focuses on the problem of interpreting tweets that describe traffic-related events.
We introduce a traffic event domain ontology, called TEDO, that models traffic-related events.
We describe a new tool to automatically interpret traffic-related tweets.
This tool translates each tweet into a set of RDF triples structured according to TEDO.

This paper addresses the problem of interpreting tweets that describe traffic-related events and that are distributed by government agencies in charge of road networks or by news agencies. Processing such tweets is of interest for two reasons. First, although phrased in natural language, such tweets use much more regular and well-behaved prose than generic user-generated tweets. This regularity makes it easier to automate their interpretation and to achieve high precision and recall. Second, government agencies and news agencies use Twitter channels to distribute real-time traffic conditions and to alert drivers about planned changes to the road network and about future events that may affect traffic conditions. Hence, such tweets provide exactly the kind of information that proactive truck fleet monitoring and similar applications require. The main contribution of the paper is an automatic tweet interpretation tool, based on Machine Learning techniques, that achieves good performance for traffic-related tweets distributed by traffic authorities and news agencies. The paper also describes in detail experiments with real traffic-related tweets that assess the precision and recall of the tool.
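As a rough illustration of what translating a tweet into RDF triples could look like, the following minimal sketch serializes a handful of already-extracted fields as N-Triples. It is not the paper's tool: the namespace `http://example.org/tedo#`, the property names (`street`, `reference`, `status`), the class name `Accident`, and the sample tweet are all hypothetical placeholders, and the Machine Learning extraction step is assumed to have already produced the `fields` dictionary.

```python
# Hypothetical namespace and vocabulary; NOT the actual TEDO ontology terms.
TEDO = "http://example.org/tedo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def event_to_ntriples(event_id, fields):
    """Serialize extracted event fields as a list of N-Triples lines.

    `fields` must contain a "type" key naming the event class; every other
    key is emitted as a plain string-literal property of the event.
    """
    subject = f"<{TEDO}{event_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{TEDO}{fields['type']}> ."]
    for key in sorted(fields):
        if key == "type":
            continue
        value = escape_literal(fields[key])
        lines.append(f'{subject} <{TEDO}{key}> "{value}" .')
    return lines


# Fields that an extractor might produce for an invented tweet such as
# "Accident on Main St near 5th Ave, right lane blocked".
fields = {
    "type": "Accident",
    "street": "Main St",
    "reference": "5th Ave",
    "status": "right lane blocked",
}
triples = event_to_ntriples("event1", fields)
for line in triples:
    print(line)
```

Each tweet yields one typing triple plus one triple per extracted attribute, so downstream applications can query the events with standard RDF tooling.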
