Outlier Detection Model Based on SOM for Classification Problem

In many practical classification problems, the data set contains outliers, which may degrade the performance of the classification model. To address this problem, this paper combines the self-organizing map (SOM), a pruning technique, and the local outlier factor (LOF) to construct an SOM-based outlier detection model (SOD). First, the training set is clustered with SOM, and the clustering results are pruned to obtain a new training set. Finally, outliers are detected from the local outlier factor of each sample in the new training set. Empirical results show that the SOD model achieves better detection performance than several existing outlier detection models, and that it improves classification accuracy more effectively when the classification models are trained without the detected outliers.
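The three stages described above (SOM clustering, pruning, LOF scoring) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify the SOM topology, the pruning rule, or the LOF threshold, so the 1-D chain of units, the `prune` rule (drop the half of each cluster nearest its unit and keep the rest as outlier candidates), and the parameters `n_units`, `drop_frac`, `k`, and `threshold` are all hypothetical choices made for the example.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_unit(x, units):
    """Index of the best-matching unit (closest weight vector)."""
    return min(range(len(units)), key=lambda j: dist(x, units[j]))


def train_som(data, n_units=3, epochs=30, lr0=0.5, seed=0):
    """Train a tiny 1-D SOM (a chain of units); returns the weight vectors."""
    rng = random.Random(seed)
    units = [list(x) for x in rng.sample(data, n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs           # linear decay of rate and radius
        lr = lr0 * decay
        sigma = max(n_units / 2.0 * decay, 0.5)
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            bmu = best_unit(x, units)
            for j, w in enumerate(units):
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                units[j] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return units


def prune(data, units, drop_frac=0.5):
    """HYPOTHETICAL pruning rule (not from the paper): within each unit's
    cluster, drop the drop_frac of samples nearest the unit's weight vector.
    Returns the sorted indices of the samples that are kept."""
    by_unit = {}
    for i, x in enumerate(data):
        by_unit.setdefault(best_unit(x, units), []).append(i)
    kept = []
    for j, members in by_unit.items():
        members.sort(key=lambda i: dist(data[i], units[j]), reverse=True)
        n_keep = len(members) - int(len(members) * drop_frac)
        kept.extend(members[:n_keep])
    return sorted(kept)


def lof_scores(points, k=3):
    """Local outlier factor of every point. Assumes len(points) > k and no
    group of more than k identical points; distance ties beyond the k-th
    neighbour are ignored for brevity."""
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point, excluding the point itself
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
           for i in range(n)]
    k_dist = [d[i][knn[i][-1]] for i in range(n)]
    # local reachability density = inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d[i][j]) for j in knn[i]]
        lrd.append(len(reach) / sum(reach))
    return [sum(lrd[j] for j in knn[i]) / (len(knn[i]) * lrd[i])
            for i in range(n)]


def som_lof_outliers(data, k=3, threshold=1.5, **som_args):
    """Cluster with SOM, prune, then flag kept samples whose LOF > threshold."""
    units = train_som(data, **som_args)
    kept = prune(data, units)
    scores = lof_scores([data[i] for i in kept], k)
    return [i for i, s in zip(kept, scores) if s > threshold]


# Demo data: two tight 3x3 grids plus one planted far-away point (index 18).
data = [[x * 0.1, y * 0.1] for x in range(3) for y in range(3)]
data += [[5 + x * 0.1, 5 + y * 0.1] for x in range(3) for y in range(3)]
data.append([2.5, 9.0])
print(som_lof_outliers(data, k=3, threshold=1.5))
```

An LOF near 1 means a sample is about as dense as its neighbours, while values well above 1 indicate a sample in a sparser region; the cut-off of 1.5 used here is only an illustrative default. Scoring only the pruned set reduces the number of pairwise distance computations, which is one plausible motivation for the pruning step.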
