Paper information - Outlier detection from large distributed databases

Outlier detection from large distributed databases

In this paper, we present an innovative system, called DISTROD (short for DISTRibuted Outlier Detector), for detecting outliers, namely abnormal instances or observations, from multiple large distributed databases. DISTROD effectively detects so-called global outliers from distributed databases, and the outliers it reports are consistent with those produced by the centralized detection paradigm. DISTROD is equipped with a number of optimization/boosting strategies that significantly improve its speed and reduce its communication overhead. Experimental evaluation demonstrates the good performance of DISTROD in terms of speed and communication overhead.

Ji Zhang | Hua Wang | Xiaohui Tao
