Outlier Detection in Urban Traffic Flow Distributions

Urban traffic data consist of observations, such as the number and speed of cars or other vehicles at certain locations, as measured by deployed sensors. These numbers can be interpreted as traffic flow, which in turn relates to the capacity of streets and the demand on the traffic system. City planners are interested in studying how various conditions affect the traffic flow and lead to unusual patterns, i.e., outliers. Existing approaches to outlier detection in urban traffic data take into account only individual flow values (i.e., individual observations). This can be useful for real-time detection of sudden changes. Here, we face a different scenario: city planners want to learn from historical data how special circumstances (e.g., events or festivals) relate to unusual patterns in the traffic flow, in order to support improved planning of both events and the layout of the traffic system. Therefore, we propose to consider the sequence of traffic flow values observed within some time interval. Such flow sequences can be modeled as probability distributions of flows. We adapt an established outlier detection method, the local outlier factor (LOF), to handle flow distributions rather than individual observations. We apply the outlier detection online to extend the database with new flow distributions that are considered inliers. For validation, we consider a special case of our framework for comparison with state-of-the-art outlier detection on flows. In addition, a real case study on urban traffic flow data showcases that our method finds meaningful outliers in the traffic flow data.
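
To make the idea of applying LOF to flow distributions more concrete, the following is a minimal sketch and not the implementation from the paper. It assumes that each time interval is summarized as a normalized histogram of its flow values and that two histograms are compared by the L1 distance between their cumulative distributions; the abstract does not name the distance actually used, so that choice, the bin layout, the neighborhood size k, all function names, and the synthetic Poisson data are illustrative assumptions. The LOF computation follows the standard definition (k-distance, reachability distance, local reachability density), simplified to exactly k neighbors per point. Only batch scoring is covered; the online extension of the database is not shown.

```python
import numpy as np


def flow_histogram(flows, bin_edges):
    """Summarize one interval's flow values as a normalized histogram."""
    # Clip so that values outside the bin range fall into the outermost bins.
    clipped = np.clip(flows, bin_edges[0], bin_edges[-1])
    counts, _ = np.histogram(clipped, bins=bin_edges)
    return counts / counts.sum()


def cdf_distance(p, q):
    """L1 distance between cumulative histograms (an assumed choice of distance)."""
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()


def lof_scores(dist, k):
    """Standard LOF on a precomputed, symmetric distance matrix.

    Simplification: every point gets exactly k neighbors (ties are cut off).
    """
    d = dist.astype(float).copy()
    np.fill_diagonal(d, np.inf)                      # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]               # indices of the k nearest neighbors
    knn_dist = np.take_along_axis(d, knn, axis=1)    # distances to those neighbors
    k_distance = knn_dist[:, -1]                     # distance to the k-th neighbor
    # reach-dist(p, o) = max(k-distance(o), d(p, o))
    reach = np.maximum(knn_dist, k_distance[knn])
    # Local reachability density; the guard avoids division by zero for duplicates.
    lrd = 1.0 / np.maximum(reach.mean(axis=1), 1e-12)
    # LOF(p) = mean lrd of p's neighbors divided by lrd(p)
    return lrd[knn].mean(axis=1) / lrd


# Synthetic illustration (not real traffic data): 30 ordinary intervals and
# one interval with a much higher flow level, appended last (index 30).
rng = np.random.default_rng(0)
bin_edges = np.arange(0, 85, 5)
intervals = [rng.poisson(20, size=48) for _ in range(30)]
intervals.append(rng.poisson(45, size=48))

hists = [flow_histogram(f, bin_edges) for f in intervals]
dist = np.array([[cdf_distance(p, q) for q in hists] for p in hists])
scores = lof_scores(dist, k=5)

print("most outlying interval:", int(np.argmax(scores)))
```

Scores close to 1 indicate an interval whose flow distribution is about as densely surrounded as its neighbors, while clearly larger scores flag candidate outliers. Any threshold for accepting a new distribution as an inlier would have to be chosen for the data at hand.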
