An outlier detection algorithm based on the degree of sharpness and its applications on traffic big data preprocessing

Outlier detection is an important research area in data mining that plays a key role in data preprocessing, equipment fault diagnosis, credit fraud detection, traffic incident detection, and similar tasks. This paper presents a new outlier detection algorithm based on the degree of sharpness, a measure borrowed from image processing. Unlike classical outlier detection methods based on statistical learning, the proposed algorithm involves no iterative process. It first generates a smooth curve that describes the overall distribution of the data, and then computes the degree of sharpness for each data point. Finally, points with larger degrees of sharpness are recognized as outliers. Practical applications on traffic big data are presented to demonstrate the effectiveness of the proposed algorithm.
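The three steps in the abstract (smooth curve, per-point degree of sharpness, flag the largest values) can be illustrated with a minimal Python sketch. The abstract gives neither the smoothing method nor the sharpness formula, so everything concrete here is an assumption: a running median stands in for the smooth curve, and sharpness is taken as 1 - |left - right| / (|apex - left| + |apex - right|), a form commonly used for contour corner detection. The names `running_median`, `sharpness`, `detect_outliers` and the parameters `half_window`, `k`, and `threshold` are hypothetical and are not taken from the paper.

```python
import math
import statistics


def running_median(values, half_window):
    """Assumed smoothing step: centered running median (window shrinks at the edges)."""
    n = len(values)
    return [
        statistics.median(values[max(0, i - half_window):min(n, i + half_window + 1)])
        for i in range(n)
    ]


def sharpness(values, smoothed, k):
    """Assumed degree of sharpness of each raw point against the smoothed curve.

    The smoothed curve at i-k and i+k gives two support points and the raw
    observation at i is the apex. The score is 0 when the three points are
    collinear and approaches 1 as the apex becomes a sharp spike.
    Points closer than k to either end keep a score of 0.
    """
    n = len(values)
    scores = [0.0] * n
    for i in range(k, n - k):
        left = (i - k, smoothed[i - k])
        right = (i + k, smoothed[i + k])
        apex = (i, values[i])
        chord = math.dist(left, right)
        legs = math.dist(apex, left) + math.dist(apex, right)
        scores[i] = 1.0 - chord / legs
    return scores


def detect_outliers(values, half_window=2, k=1, threshold=0.5):
    """Return indices whose sharpness exceeds an illustrative threshold."""
    smoothed = running_median(values, half_window)
    scores = sharpness(values, smoothed, k)
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy series (made-up values, not traffic data) with one obvious spike.
series = [10, 11, 10, 12, 11, 60, 11, 10, 12, 11, 10]
print(detect_outliers(series))
```

The sketch makes a single pass over the data with no iterative fitting, which matches the property the abstract emphasizes. Note that this sharpness score depends on the relative scaling of the index and value axes, so the threshold of 0.5 is illustrative only and would need tuning for real data.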
