Entropy estimates of small data sets

Estimating entropies from limited data series is known to be a non-trivial task. Naive estimates are plagued by both systematic (bias) and statistical errors. Here, we present a new 'balanced estimator' for entropy functionals (Shannon, Rényi and Tsallis) specifically devised to provide a compromise between low bias and small statistical error for short data series. This new estimator outperforms other currently available ones when the data sets are small and the probabilities of the possible outcomes of the random variable are not close to zero; otherwise, other well-known estimators remain the better choice. The potential range of applicability of this estimator is quite broad, especially for biological and digital data series.
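
To make the bias problem concrete for the Shannon case, here is a minimal Python sketch: naive_entropy is the standard plug-in (maximum-likelihood) estimator, whose leading-order downward bias of -(M-1)/(2N) is a well-known result; miller_madow_entropy adds that standard correction; and balanced_entropy is a reconstruction of the balanced Shannon estimator described above. The function names and the demo distribution are illustrative, and the closed form used in balanced_entropy is an assumption to be checked against the published derivation, not a quotation from the paper.

```python
import numpy as np

def naive_entropy(counts):
    """Plug-in (maximum-likelihood) Shannon entropy estimate, in nats.

    Biased downward for small samples: to leading order the bias is
    -(M - 1) / (2 N), where M is the number of observed states and
    N the sample size.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts[counts > 0] / n
    return -np.sum(p * np.log(p))

def miller_madow_entropy(counts):
    """Naive estimate plus the standard leading-order bias correction."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    m = np.count_nonzero(counts)
    return naive_entropy(counts) + (m - 1) / (2.0 * n)

def balanced_entropy(counts):
    """Balanced Shannon estimator (hypothetical reconstruction).

    ASSUMPTION: the closed form
        S = 1/(N+2) * sum_i (n_i + 1) * sum_{j = n_i + 2}^{N + 2} 1/j
    is a reading of the balanced estimator in the published version of
    this work; verify the exact coefficients against the paper.
    """
    counts = np.asarray(counts, dtype=int)
    n = int(counts.sum())
    total = 0.0
    for ni in counts:
        # Inner harmonic sum runs over j = n_i + 2, ..., N + 2.
        total += (ni + 1) * np.sum(1.0 / np.arange(ni + 2, n + 3))
    return total / (n + 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p_true = np.array([0.5, 0.3, 0.2])          # illustrative distribution
    true_h = -np.sum(p_true * np.log(p_true))   # true entropy, ~1.0297 nats
    sample = rng.choice(3, size=20, p=p_true)   # short series, N = 20
    counts = np.bincount(sample, minlength=3)
    print(f"true        : {true_h:.4f}")
    print(f"naive       : {naive_entropy(counts):.4f}")
    print(f"Miller-Madow: {miller_madow_entropy(counts):.4f}")
    print(f"balanced    : {balanced_entropy(counts):.4f}")
```

Running the demo repeatedly with fresh samples shows the qualitative point of the abstract: the naive estimate sits systematically below the true value at N = 20, while corrected estimators trade some of that bias against statistical scatter.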
