Distance-based k-nearest neighbors outlier detection method in large-scale traffic data

This paper presents a k-nearest neighbors (kNN) method for detecting outliers in the large-scale traffic data that modern cities collect daily. Outliers include hardware and data errors as well as abnormal traffic behaviors. The proposed kNN method detects outliers by exploiting the neighborhood relationships among data points: the farther a data point lies from its neighbors, the more likely it is to be an outlier. The traffic data used here was recorded as video and converted to spatial-temporal (ST) traffic signals through statistical processing. The ST signals are then projected onto a two-dimensional (2D) (x, y)-coordinate plane by Principal Component Analysis (PCA) for dimension reduction. The distance-based kNN method is evaluated under both unsupervised and semi-supervised approaches. The semi-supervised approach reaches 96.19% accuracy.
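The pipeline described above (PCA projection to 2D, then distance-based kNN scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the kNN distance is aggregated or how the cutoff is chosen, so the mean distance to the k nearest neighbors, the quantile threshold, the function names, and the synthetic 24-dimensional signals are all assumptions made for illustration.

```python
import numpy as np

def pca_2d(signals):
    """Project row-vector signals onto their first two principal components."""
    centered = signals - signals.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def knn_outlier_scores(points, k=5):
    """Score each point by its mean Euclidean distance to its k nearest neighbors.

    Assumption: the mean is used here; the distance to the k-th neighbor
    alone is another common choice.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

def detect_outliers(signals, k=5, quantile=0.95):
    """Flag points whose kNN score exceeds a quantile threshold (an assumed rule)."""
    points = pca_2d(signals)
    scores = knn_outlier_scores(points, k)
    threshold = np.quantile(scores, quantile)
    return scores > threshold, scores

# Synthetic stand-in for ST traffic signals: 200 typical rows, 3 shifted rows.
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 24))
abnormal = rng.normal(0.0, 1.0, size=(3, 24)) + 25.0
signals = np.vstack([normal, abnormal])

flags, scores = detect_outliers(signals, k=5, quantile=0.95)
print("flagged rows:", np.flatnonzero(flags))
```

A quantile threshold always flags a fixed fraction of the data, so some ordinary points are flagged alongside the shifted ones. In a semi-supervised setting, the threshold could instead be calibrated on data known to be normal.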
