New local density definition based on the minimum hypersphere for outlier mining algorithms used in industrial databases

Outlier detection is an important step in preprocessing industrial datasets, helping to ensure that the industrial process operates normally. This paper proposes a new local density definition, based on the minimum enclosing hypersphere, for outlier mining algorithms. First, a novel local k-density of an object is defined using the minimum enclosing hypersphere algorithm. Next, the new k-density definition is adopted in the local outlier factor (LOF) algorithm, the INFLuenced Outlierness (INFLO) algorithm, and the density-similarity-neighbor-based outlier mining (DSNOF) algorithm, yielding the ndLOF, ndINFLO, and ndDSNOF algorithms. Finally, we compare the performance of ndLOF, ndINFLO, and ndDSNOF with that of LOF, INFLO, and DSNOF on synthetic datasets. The experimental results confirm that the proposed definition is meaningful and that the outlier mining algorithms improved by the new definition achieve higher outlier mining quality.
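The abstract does not state the k-density formula, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the k-density of an object is k divided by the radius of an approximate minimum enclosing hypersphere around the object and its k nearest neighbors, and it approximates that hypersphere with the simple Badoiu-Clarkson iteration rather than whatever solver the paper uses. The names `approx_min_enclosing_ball`, `k_density`, and `density_ratio_score` are hypothetical, and the LOF-style ratio is an illustration, not the paper's ndLOF.

```python
import math


def approx_min_enclosing_ball(points, iterations=200):
    """Approximate the minimum enclosing ball with the Badoiu-Clarkson iteration.

    The center repeatedly takes a shrinking step toward the current farthest point.
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(p, center))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, farthest)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius


def k_nearest_indices(data, index, k):
    """Indices of the k objects closest to data[index], excluding itself."""
    others = (j for j in range(len(data)) if j != index)
    return sorted(others, key=lambda j: math.dist(data[j], data[index]))[:k]


def k_density(data, index, k):
    """ASSUMED definition: k over the radius of the ball enclosing the object
    and its k nearest neighbors. The paper's exact formula may differ."""
    members = [data[index]] + [data[j] for j in k_nearest_indices(data, index, k)]
    _, radius = approx_min_enclosing_ball(members)
    return k / radius if radius > 0 else math.inf


def density_ratio_score(data, index, k):
    """LOF-style ratio (illustrative): mean neighbor density over own density.

    Values well above 1 suggest the object is sparser than its neighborhood.
    """
    neighbors = k_nearest_indices(data, index, k)
    mean_neighbor_density = sum(k_density(data, j, k) for j in neighbors) / k
    return mean_neighbor_density / k_density(data, index, k)


# Five clustered objects and one distant object (synthetic, for illustration only).
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (8.0, 8.0)]
scores = [density_ratio_score(data, i, k=3) for i in range(len(data))]
most_outlying = max(range(len(data)), key=lambda i: scores[i])
```

In this toy dataset the distant object at index 5 receives the largest score, because the hypersphere around it and its neighbors is much larger than the hyperspheres around the clustered objects. Using the radius instead of the hypersphere volume is a design choice made here to keep the density numerically stable as the dimension grows.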
