Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks

Entropy is a fundamental quantity in statistics and machine learning. In this note, we present a novel procedure for statistical learning of entropy from high-dimensional small-sample data. Specifically, we introduce a simple yet very powerful small-sample estimator of the Shannon entropy based on James-Stein-type shrinkage. The resulting estimator is highly efficient both statistically and computationally. Despite its simplicity, we show that it outperforms, in some cases substantially, eight competing entropy estimation procedures across a diverse range of sampling scenarios and data-generating models, including cases of severe undersampling. A computer program that implements the proposed estimator is available.
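To make the idea of James-Stein-type shrinkage for entropy concrete, the following is a minimal sketch rather than the program mentioned above. It assumes, beyond what the abstract states, that the observed cell frequencies are shrunk toward a uniform target 1/K and that the shrinkage intensity is estimated in closed form as (1 - sum p_k^2) / ((n - 1) sum (t_k - p_k)^2), clamped to [0, 1]. The function name `shrinkage_entropy` is hypothetical.

```python
import math


def shrinkage_entropy(counts):
    """Sketch of a shrinkage-based plug-in estimate of Shannon entropy (in nats).

    Assumptions (not stated in the abstract): uniform shrinkage target and a
    closed-form shrinkage intensity clamped to [0, 1].
    Returns (entropy_estimate, shrinkage_intensity).
    """
    n = sum(counts)
    k = len(counts)
    if k == 0 or n <= 0:
        raise ValueError("need at least one cell and one observation")

    # Maximum-likelihood (empirical) cell frequencies.
    p_ml = [c / n for c in counts]
    target = 1.0 / k  # uniform target: every cell equally likely

    # Estimated shrinkage intensity: estimated variance of the frequencies
    # divided by their squared distance to the target.
    denom = (n - 1) * sum((target - p) ** 2 for p in p_ml)
    if denom == 0:
        # n == 1 or frequencies already uniform: shrink fully to the target.
        lam = 1.0
    else:
        lam = (1.0 - sum(p * p for p in p_ml)) / denom
        lam = min(1.0, max(0.0, lam))

    # Shrunken frequencies are a convex combination of target and ML estimate.
    p_shrink = [lam * target + (1.0 - lam) * p for p in p_ml]

    # Plug the shrunken frequencies into the Shannon entropy formula.
    h = -sum(p * math.log(p) for p in p_shrink if p > 0)
    return h, lam
```

With very few observations the intensity is close to 1 and the estimate approaches log K, the entropy of the uniform target; with many observations the intensity falls toward 0 and the estimate approaches the ordinary plug-in value computed from the empirical frequencies.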
