An evaluation framework for traffic information systems based on data streams

Abstract: Traffic information systems have to process and analyze huge amounts of data in real time to provide traffic information to road users effectively. Progress in mobile communication technology, with higher bandwidths and lower latencies, enables the use of data provided by in-car sensors. Data stream management systems have been proposed to address the challenges of such applications, which have to process a continuous flow of data from various sources in real time. Data mining methods adapted to data streams can be used to analyze the data and to identify interesting patterns such as congestion or road hazards. Although several data stream mining methods have been proposed, an evaluation of such methods in the context of traffic applications is still missing. In this paper, we present an evaluation framework for traffic information systems based on data streams. We use traffic simulation software to emulate the generation of traffic data by mobile probes. The framework is applied in two case studies, namely queue-end detection and traffic state estimation. The results show which parameters of the traffic information system significantly affect the accuracy of the predicted traffic information. These findings inform the design and implementation of traffic information systems that use data from mobile probes.
