Paper information - High-Dimensional Outlier Detection: The Subspace Method

Many real data sets are very high dimensional, in some scenarios containing hundreds or thousands of dimensions. As dimensionality increases, many conventional outlier detection methods become ineffective. This is an artifact of the well-known curse of dimensionality: in high-dimensional space the data becomes sparse, and when analyzed in full dimensionality the true outliers are masked by the noise effects of multiple irrelevant dimensions.
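The masking effect described in the abstract can be illustrated with a small synthetic experiment. The sketch below is not the method from the paper. It assumes a plain k-nearest-neighbor distance score as a stand-in for a "conventional" detector, and the helper names (`knn_scores`, `rank_of`), the sample sizes, and the planted outlier are all illustrative assumptions. One point is made extreme in two dimensions only; it is then ranked once using those two dimensions and once using all dimensions, most of which are pure noise.

```python
import numpy as np

def knn_scores(X, k=5):
    """Outlier score per point: Euclidean distance to its k-th nearest neighbor."""
    sq = np.sum(X * X, axis=1)
    # Squared pairwise distances via the Gram matrix (avoids an n x n x d array).
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negatives from round-off
    np.fill_diagonal(d2, np.inf)     # a point is not its own neighbor
    kth = np.partition(d2, k - 1, axis=1)[:, k - 1]
    return np.sqrt(kth)

def rank_of(scores, i):
    """1-based rank of point i when points are sorted by descending score."""
    return int(np.sum(scores > scores[i])) + 1

# Hypothetical setup: 300 points, 2000 dimensions of standard normal noise.
rng = np.random.default_rng(0)
n, d = 300, 2000
X = rng.standard_normal((n, d))

# Plant an outlier that is extreme only in the first two dimensions.
X[0, :2] = 6.0

sub_rank = rank_of(knn_scores(X[:, :2]), 0)   # relevant 2-D subspace only
full_rank = rank_of(knn_scores(X), 0)         # all 2000 dimensions

print(f"rank in the 2-D subspace:  {sub_rank} of {n}")
print(f"rank in full dimensionality: {full_rank} of {n}")
```

In the two relevant dimensions the planted point stands far from everything else, so it should receive the top score. In the full space its deviation is small next to the distance contributed by the other 1998 noise dimensions, so its rank will typically fall well below the top, although the exact value depends on the random draw.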
