Mining the Situation: Spatiotemporal Traffic Prediction With Big Data

Traffic sensors are now widely deployed, and the traffic information derived from them has motivated extensive research on traffic prediction techniques, which in turn improve route navigation, traffic regulation, and urban planning. One key challenge in traffic prediction is deciding how much to rely on prediction models built from historical data when the real-time traffic situation may differ from the situations in that data and may change over time. In this paper, we propose a novel online framework that learns from the current traffic situation (or context) in real time and predicts future traffic by matching the current situation to the most effective prediction model trained on historical data. As real-time traffic data arrives, the traffic context space is adaptively partitioned in order to efficiently estimate the effectiveness of each base predictor in different situations. We prove both short-term and long-term performance guarantees (bounds) for our online algorithm. The proposed algorithm also works effectively when the true labels (i.e., the realized traffic) are missing or become available only after a delay. The framework also reveals the context dimension most relevant to traffic prediction, which can reduce implementation complexity and inform traffic policy making. Our experiments with real-world data in real-life conditions show that the proposed approach significantly outperforms existing solutions.
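The idea of adaptively partitioning the context space and matching each situation to the most effective base predictor can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a one-dimensional context normalized to [0, 1), a 0/1 reward for each prediction, greedy selection after trying each predictor once per cell, and a fixed splitting rule. The names `Cell`, `AdaptiveContextSelector`, `base_arrivals`, `growth`, and `max_level` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Cell:
    """One interval [lo, hi) of the context space, with per-predictor statistics."""
    lo: float
    hi: float
    level: int
    counts: list = field(default_factory=list)       # uses of each predictor in this cell
    reward_sums: list = field(default_factory=list)  # summed rewards of each predictor
    arrivals: int = 0                                # contexts observed in this cell


class AdaptiveContextSelector:
    """Hypothetical sketch: pick a base predictor per context cell, refining cells over time."""

    def __init__(self, n_predictors, base_arrivals=8, growth=2.0, max_level=3):
        self.n_predictors = n_predictors
        self.base_arrivals = base_arrivals  # assumed split threshold at level 0
        self.growth = growth                # assumed threshold growth per level
        self.max_level = max_level          # assumed cap so the partition stays small
        self.cells = [self._new_cell(0.0, 1.0, 0)]

    def _new_cell(self, lo, hi, level):
        return Cell(lo, hi, level, [0] * self.n_predictors, [0.0] * self.n_predictors)

    def _find(self, context):
        for cell in self.cells:
            if cell.lo <= context < cell.hi:
                return cell
        raise ValueError("context must lie in [0, 1)")

    def select(self, context):
        """Return the index of the predictor to use for this context."""
        cell = self._find(context)
        # Try every predictor once in a cell before trusting the estimates.
        for k, count in enumerate(cell.counts):
            if count == 0:
                return k
        means = [s / c for s, c in zip(cell.reward_sums, cell.counts)]
        return max(range(self.n_predictors), key=means.__getitem__)

    def update(self, context, predictor, reward):
        """Record the realized reward, then split the cell if it has seen enough arrivals."""
        cell = self._find(context)
        cell.counts[predictor] += 1
        cell.reward_sums[predictor] += reward
        cell.arrivals += 1
        threshold = self.base_arrivals * self.growth ** cell.level
        if cell.level < self.max_level and cell.arrivals >= threshold:
            mid = (cell.lo + cell.hi) / 2
            self.cells.remove(cell)
            # Children start with fresh statistics (a simplifying assumption).
            self.cells.append(self._new_cell(cell.lo, mid, cell.level + 1))
            self.cells.append(self._new_cell(mid, cell.hi, cell.level + 1))


# Toy simulation: predictor 0 is correct for contexts below 0.5, predictor 1 otherwise.
selector = AdaptiveContextSelector(n_predictors=2)
for t in range(1000):
    x = (t * 0.37) % 1.0
    k = selector.select(x)
    best = 0 if x < 0.5 else 1
    selector.update(x, k, 1.0 if k == best else 0.0)

print(selector.select(0.1), selector.select(0.9), len(selector.cells))
```

In this toy setting the partition refines until each cell covers contexts where a single predictor is best, so the per-cell estimates become reliable. Handling missing or delayed labels, which the abstract also mentions, is not covered by this sketch.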
