Coping with training contamination in unsupervised distributional anomaly detection

In previous work [1], we presented several distributional approaches to anomaly detection for a speech activity detector: a model is trained on purely nominal data, and the divergence between that model and new input is estimated. Here, we reformulate the problem in an unsupervised framework and allow the training data to be contaminated with anomalies. After noting the instability of Gaussian mixture models (GMMs) in this setting, we focus on non-parametric methods based on regularly binned histograms. While the performance of the log-likelihood baseline suffered as the amount of contamination increased, many of the distributional approaches were unaffected. We found that the L1 distance, the χ² statistic, and the information-theoretic divergences consistently outperformed the other methods across a variety of contamination levels and test segment lengths.
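As a rough illustration of the histogram-based measures named above, the following sketch compares two regularly binned, normalized histograms using the L1 distance, a χ² statistic, and one information-theoretic divergence. The function names, the binning range, the symmetric form of the χ² statistic (each histogram compared against the mean of the two), and the choice of the Jensen-Shannon divergence as the representative information-theoretic measure are all assumptions made for illustration; the abstract does not specify the exact definitions used in the paper.

```python
import math
import random


def histogram(samples, lo, hi, n_bins):
    """Regularly binned, normalized histogram over [lo, hi] (hypothetical helper)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        # Clamp out-of-range samples into the first or last bin.
        idx = min(max(int((x - lo) / width), 0), n_bins - 1)
        counts[idx] += 1
    total = len(samples)
    return [c / total for c in counts]


def l1_distance(p, q):
    """Sum of absolute bin-wise differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def chi2_statistic(p, q):
    """Chi-squared statistic of p against the mean histogram (assumed form)."""
    total = 0.0
    for a, b in zip(p, q):
        m = (a + b) / 2
        if m > 0:  # skip bins that are empty in both histograms
            total += (a - m) ** 2 / m
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins: nominal data near 0, anomalous data near 3.
    nominal = [rng.gauss(0, 1) for _ in range(900)]
    anomalous = [rng.gauss(3, 1) for _ in range(100)]
    train = histogram(nominal + anomalous, -5, 8, 26)  # 10% contamination
    test_ok = histogram([rng.gauss(0, 1) for _ in range(200)], -5, 8, 26)
    test_bad = histogram([rng.gauss(3, 1) for _ in range(200)], -5, 8, 26)
    for name, f in [("L1", l1_distance), ("chi2", chi2_statistic),
                    ("JS", js_divergence)]:
        print(name, f(train, test_ok), f(train, test_bad))
```

In this toy setup, a test segment is scored by the divergence between its histogram and the histogram of the (possibly contaminated) training data; larger values would suggest a more anomalous segment. The synthetic Gaussian data and the 10% contamination level are placeholders, not the speech features or settings evaluated in the paper.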

[1]  Bing Liu,et al.  Learning with Positive and Unlabeled Examples Using Weighted Logistic Regression , 2003, ICML.

[2]  Leonidas J. Guibas,et al.  The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.

[3]  Neil Robert Garner,et al.  Voice activity detector , 1997 .

[4]  C. E. Shannon,et al.  A mathematical theory of communication , 1948, MOCO.

[5]  Peter J. Rousseeuw,et al.  Robust regression and outlier detection , 1987 .

[6]  Claude E. Shannon,et al.  A Mathematical Theory of Communications , 1948 .

[7]  Philip S. Yu,et al.  Partially Supervised Classification of Text Documents , 2002, ICML.

[8]  Jianhua Lin,et al.  Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.

[9]  Rui Li,et al.  The analysis and applications of adaptive-binning color histograms , 2004, Comput. Vis. Image Underst.

[10]  J. E. Porter,et al.  Normalizations and selection of speech segments for speaker recognition scoring , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[11]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[12]  Kevin Chen-Chuan Chang,et al.  PEBL: positive example based learning for Web page classification using SVM , 2002, KDD.

[13]  François Denis,et al.  PAC Learning from Positive Statistical Queries , 1998, ALT.

[14]  Peter J. Rousseeuw,et al.  Robust Regression and Outlier Detection , 2005, Wiley Series in Probability and Statistics.

[15]  Tomaso Poggio,et al.  Computing texture boundaries from images , 1988, Nature.

[16]  David Hinkley,et al.  Bootstrap Methods: Another Look at the Jackknife , 2008 .

[17]  Philip H. S. Torr,et al.  Outlier detection and motion segmentation , 1993, Other Conferences.

[18]  Jaideep Srivastava,et al.  A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection , 2003, SDM.

[19]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, STOC '84.

[20]  Alvin F. Martin,et al.  The DET curve in assessment of detection task performance , 1997, EUROSPEECH.

[21]  David C. Smith,et al.  A multivariate speech activity detector based on the syllable rate , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99.

[22]  Joachim M. Buhmann,et al.  Non-parametric similarity measures for unsupervised texture segmentation and image retrieval , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[23]  Rémi Gilleron,et al.  Text Classification from Positive and Unlabeled Examples , 2002 .

[24]  Joachim M. Buhmann,et al.  Empirical evaluation of dissimilarity measures for color and texture , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[25]  Huaiyu Zhu.  On Information and Sufficiency , 1997 .

[26]  Gerard G. L. Meyer,et al.  Unsupervised distributional anomaly detection for a self-diagnostic speech activity detector , 2008, 2008 42nd Annual Conference on Information Sciences and Systems.

[27]  D. Hand,et al.  Unsupervised Profiling Methods for Fraud Detection , 2002 .

[28]  D. Angluin,et al.  Learning From Noisy Examples , 1988, Machine Learning.

[29]  Victoria J. Hodge,et al.  A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[30]  Rémi Gilleron,et al.  Learning from positive and unlabeled examples , 2000, Theor. Comput. Sci.

[31]  Donald Geman,et al.  Boundary Detection by Constrained Optimization , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[32]  Yorick Wilks,et al.  Unsupervised Anomaly Detection , 2007, IJCAI.

[33]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .