From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data

Traditional traffic prediction methods are often of limited use for forecasting traffic dynamics in the early morning. The reason is that traffic can break down drastically during the early morning commute, and the time and duration of this breakdown vary substantially from day to day. Early morning traffic forecasts are crucial for informing morning-commute traffic management, but early morning traffic is generally hard to predict in advance, particularly by midnight of the previous day. In this paper, we propose to mine Twitter messages as a probing method to understand how people's work and rest patterns in the evening and around midnight of the previous day affect next-day morning traffic. The model is tested on freeway networks in Pittsburgh. The resulting relationship is surprisingly simple and powerful. We find that, in general, the earlier people rest, as indicated by tweets, the more congested roads will be the next morning. The occurrence of big events the evening before, represented by higher or lower tweet sentiment than normal, often implies lower travel demand the next morning than on normal days. In addition, people's tweeting activities the night before and in the early morning are statistically associated with congestion in morning peak hours. We use these relationships to build a predictive framework that forecasts morning commute congestion using people's tweeting profiles extracted by 5 am, or by midnight of the previous night. The Pittsburgh study shows that our framework can precisely predict morning congestion, particularly for some road segments upstream of roadway bottlenecks with large day-to-day congestion variation. Our approach considerably outperforms existing methods that do not use Twitter message features, and it can learn meaningful representations of demand from tweeting profiles that offer managerial insights.
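
As a rough illustration of the kind of relationship described above (earlier rest, heavier congestion the next morning), the following minimal sketch fits a one-variable least-squares line. It is not the paper's framework: the feature name `rest_hour` (an estimated hour at which tweeting stops), the `congestion` index, and all numbers are hypothetical toy values invented for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data, NOT from the paper.
# rest_hour: assumed hour at which tweeting stops (24.5 means 00:30).
# congestion: assumed next-morning congestion index in [0, 1].
rest_hour = [22.0, 22.5, 23.0, 23.5, 24.0, 24.5]
congestion = [0.62, 0.58, 0.55, 0.49, 0.47, 0.41]

intercept, slope = fit_line(rest_hour, congestion)


def predict(hour):
    """Predict next-morning congestion from an estimated rest hour."""
    return intercept + slope * hour


print(f"slope = {slope:.3f}")
print(f"predicted congestion if people rest at 22:00: {predict(22.0):.2f}")
print(f"predicted congestion if people rest at 24:00: {predict(24.0):.2f}")
```

With these toy values the fitted slope is negative, so an earlier rest hour maps to a higher predicted congestion index. A realistic model would use many more features (for example tweet sentiment and tweeting activity counts) and real traffic measurements.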
