k-nearest neighbor estimation of entropies with confidence

We analyze a k-nearest neighbor (k-NN) class of plug-in estimators for Shannon and Rényi entropies. Based on the statistical properties of k-NN balls, we derive explicit rates for the bias and variance of these plug-in estimators in terms of the sample size, the dimension of the samples, and the underlying probability distribution. In addition, we establish a central limit theorem for the plug-in estimator that allows us to specify confidence intervals on the entropy functionals. As an application, we use our theory in anomaly detection problems to specify thresholds that achieve desired false alarm rates.
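The abstract does not spell out the estimator itself, but a standard member of the k-NN plug-in family it describes is the Kozachenko–Leonenko estimator, which converts the distance to each point's k-th nearest neighbor into a local density estimate via the volume of the k-NN ball. The sketch below is illustrative, not the paper's exact construction; the sample distribution, `k`, and sample size are arbitrary choices for the demonstration.

```python
import math
import random

def knn_entropy(points, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy (in nats).

    H_hat = psi(n) - psi(k) + log(c_d) + (d/n) * sum_i log(eps_i),
    where eps_i is the distance from point i to its k-th nearest
    neighbor and c_d is the volume of the unit ball in R^d.
    """
    n = len(points)
    d = len(points[0])

    # At integer arguments, psi(m) = -gamma + H_{m-1} (harmonic number),
    # so the digamma values can be computed exactly without SciPy.
    euler_gamma = 0.5772156649015329
    def digamma(m):
        return -euler_gamma + sum(1.0 / i for i in range(1, m))

    # log of the unit d-ball volume: pi^(d/2) / Gamma(d/2 + 1)
    log_c_d = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)

    # Brute-force k-NN distances (O(n^2 log n)); fine for a small demo.
    log_eps_sum = 0.0
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        log_eps_sum += math.log(dists[k - 1])

    return digamma(n) - digamma(k) + log_c_d + (d / n) * log_eps_sum

# Demo: the differential entropy of N(0,1) is 0.5*ln(2*pi*e) ~ 1.419 nats.
random.seed(0)
sample = [(random.gauss(0.0, 1.0),) for _ in range(1500)]
est = knn_entropy(sample, k=3)
```

In practice the brute-force neighbor search would be replaced by a k-d tree, and, as the abstract notes, the bias and variance of such an estimate depend on the sample size, the dimension, and the smoothness of the underlying density.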
