Outlier Detection Using Ball Descriptions with Adjustable Metric

Sometimes novel or outlier data has to be detected. The outliers may indicate an interesting rare event, or they should be disregarded because they cannot be reliably processed further. In the ideal case where the objects are represented by very good features, the genuine data forms a compact cluster, and a good outlier measure is the distance to the cluster center. This paper proposes three new formulations to find a good cluster center together with an optimized lp-distance measure. Experiments show that for some real-world datasets very good classification results are obtained and that, more specifically, the l1-distance is particularly suited to datasets containing discrete feature values.
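The basic idea of a ball description, scoring an object by its lp-distance to a cluster center and rejecting it when that distance exceeds a radius, can be illustrated with a minimal sketch. This is not one of the three formulations proposed in the paper: the center is simply assumed to be the training mean, p is fixed by hand rather than optimized, and the radius is assumed to be a quantile of the training distances. The names `lp_distance`, `fit_ball`, `is_outlier`, and `reject_fraction` are hypothetical.

```python
import numpy as np

def lp_distance(X, center, p):
    """Minkowski (l_p) distance from each row of X to `center`."""
    diff = np.abs(np.asarray(X, dtype=float) - np.asarray(center, dtype=float))
    return np.sum(diff ** p, axis=1) ** (1.0 / p)

def fit_ball(X_train, p=1.0, reject_fraction=0.05):
    """Fit a ball around the training data.

    Assumptions for illustration only: the center is the feature-wise mean,
    and the radius is the (1 - reject_fraction) quantile of the training
    distances. The paper instead optimizes the center and the metric.
    """
    X_train = np.asarray(X_train, dtype=float)
    center = X_train.mean(axis=0)
    distances = lp_distance(X_train, center, p)
    radius = np.quantile(distances, 1.0 - reject_fraction)
    return center, radius

def is_outlier(X, center, radius, p=1.0):
    """Flag objects whose l_p-distance to the center exceeds the radius."""
    return lp_distance(X, center, p) > radius

# Usage: four genuine objects on the corners of the unit square.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
center, radius = fit_ball(X_train, p=1.0, reject_fraction=0.0)
flags = is_outlier(np.array([[0.5, 0.5], [5.0, 5.0]]), center, radius, p=1.0)
```

With p = 1 the ball is a diamond aligned with the feature axes, and with p = 2 it is the usual Euclidean sphere, so changing p changes which objects fall outside the description even when the center stays the same.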
