Non-parametric Information-Theoretic Measures of One-Dimensional Distribution Functions from Continuous Time Series

We study non-parametric measures for comparing distributions, a problem that arises in anomaly detection for continuous time series. Each measure takes two distributions as input and produces two numbers as output: the difference between the input distributions and the statistical significance of this difference. Some of these measures, such as the Kullback-Leibler measure, are defined for comparing probability distribution functions (PDFs), while others, such as the Kolmogorov-Smirnov measure, are defined for cumulative distribution functions (CDFs). We first show how to adapt the PDF-based measures to compare CDFs, resulting in a total of 23 CDF-based measures. We then provide a unified functional form that subsumes all of these measures. We present our methodology for determining the significance of the measures by simulation alone. Finally, we evaluate these measures for anomaly detection in continuous time series.
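To make the "two numbers" interface concrete, the following is a minimal sketch of one CDF-based measure, the Kolmogorov-Smirnov distance between two empirical CDFs, paired with a significance estimate obtained purely by simulation. The function names (`ecdf_values`, `ks_distance`, `compare`) and the choice of permutation resampling are assumptions made for illustration; the abstract does not specify the simulation procedure, so this should not be read as the authors' method.

```python
import bisect
import random


def ecdf_values(sample, points):
    """Empirical CDF of `sample` evaluated at each value in `points`."""
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect.bisect_right(ordered, x) / n for x in points]


def ks_distance(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    points = sorted(set(a) | set(b))
    fa = ecdf_values(a, points)
    fb = ecdf_values(b, points)
    return max(abs(u - v) for u, v in zip(fa, fb))


def compare(a, b, measure=ks_distance, trials=500, seed=0):
    """Return (difference, simulated p-value) for samples a and b.

    Assumption: significance is estimated by permutation resampling,
    i.e. by reshuffling the pooled data and counting how often the
    reshuffled difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = measure(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if measure(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (trials + 1)
```

Because `compare` accepts any function of two samples as `measure`, other CDF-based measures could be plugged in without changing the significance step, which is one way to read the idea of a unified treatment.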
