Information theoretic novelty detection

We present a novel approach to online change detection when the training sample is small. The approach is based on estimating the expected information content of a new data point, and it allows accurate control of the false positive rate even for small data sets. In the case of the Gaussian distribution, the approach is analytically tractable and closely related to classical statistical tests. We then propose an approximation scheme that extends the approach to mixtures of Gaussians. We evaluate the approach extensively on synthetic data and on three real benchmark data sets. The experimental validation shows that our method maintains good overall accuracy while significantly improving control over the false positive rate.
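To make the small-sample issue concrete, the following is a minimal sketch for the univariate Gaussian case. It is not the information-theoretic criterion of the paper. It only illustrates the kind of classical test the abstract alludes to: a studentized distance whose null distribution does not depend on the unknown mean and variance. The function names (`plugin_score`, `studentized_score`, `calibrate_threshold`) are hypothetical, and the threshold is calibrated by simulation rather than from a closed-form quantile.

```python
import math
import random
import statistics


def mean_std(sample):
    """Sample mean and standard deviation (n - 1 denominator)."""
    n = len(sample)
    m = sum(sample) / n
    var = sum((v - m) ** 2 for v in sample) / (n - 1)
    return m, math.sqrt(var)


def plugin_score(train, x):
    """|x - mean| / s: treats the estimated parameters as if they were exact."""
    m, s = mean_std(train)
    return abs(x - m) / s


def studentized_score(train, x):
    """|x - mean| / (s * sqrt(1 + 1/n)): accounts for uncertainty in the mean."""
    m, s = mean_std(train)
    return abs(x - m) / (s * math.sqrt(1.0 + 1.0 / len(train)))


def null_scores(score, n, trials, rng):
    """Scores of in-distribution test points. Both scores are invariant to
    location and scale, so simulating from N(0, 1) loses no generality."""
    return [
        score([rng.gauss(0.0, 1.0) for _ in range(n)], rng.gauss(0.0, 1.0))
        for _ in range(trials)
    ]


def false_positive_rate(score, threshold, n, trials, rng):
    """Fraction of in-distribution test points flagged as novel."""
    scores = null_scores(score, n, trials, rng)
    return sum(s > threshold for s in scores) / trials


def calibrate_threshold(score, n, alpha, trials, rng):
    """Empirical (1 - alpha) quantile of the score under the null."""
    scores = sorted(null_scores(score, n, trials, rng))
    return scores[int((1.0 - alpha) * trials)]


rng = random.Random(0)
n, alpha, trials = 5, 0.05, 20000

# Naive rule: plug the estimates into a Gaussian and use the normal quantile.
z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
naive_fpr = false_positive_rate(plugin_score, z, n, trials, rng)

# Calibrated rule: threshold the studentized score at its simulated null quantile.
t_crit = calibrate_threshold(studentized_score, n, alpha, trials, rng)
calibrated_fpr = false_positive_rate(studentized_score, t_crit, n, trials, rng)

print(f"naive FPR: {naive_fpr:.3f}, calibrated FPR: {calibrated_fpr:.3f}")
```

With only five training points, the naive plug-in rule flags far more in-distribution points than the nominal 5%, while the calibrated rule stays close to the target rate. The choice of `n = 5` and `alpha = 0.05` is purely illustrative.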
