A Unified Notion of Outliers: Properties and Computation

As the saying goes in signal processing, "One person's noise is another person's signal." In many applications, such as the exploration of satellite or medical images and the monitoring of criminal activity in electronic commerce, identifying exceptions can lead to the discovery of truly unexpected knowledge. In this paper, we study an intuitive notion of outliers. A key contribution is to show how the proposed notion unifies or generalizes many existing notions of outliers provided by discordancy tests for standard statistical distributions. Thus, a unified outlier detection system can replace a whole spectrum of statistical discordancy tests with a single module that detects only the kind of outliers proposed. A second contribution is the development of an approach for finding all outliers in a dataset. The structure underlying this approach resembles a data cube, which facilitates integration with the many OLAP and data mining systems that use data cubes.