Estimating the Entropy of Binary Time Series: Methodology, Some Theory and a Simulation Study

Abstract: Partly motivated by entropy-estimation problems in neuroscience, we present a detailed and extensive comparison between some of the most popular and effective entropy estimation methods used in practice: the plug-in method, four different estimators based on the Lempel-Ziv (LZ) family of data compression algorithms, an estimator based on the Context-Tree Weighting (CTW) method, and the renewal entropy estimator. METHODOLOGY: Three new entropy estimators are introduced: two new LZ-based estimators, and the “renewal entropy estimator,” which is tailored to data generated by a binary renewal process. For two of the four LZ-based estimators, a bootstrap procedure is described for evaluating their standard error, and a practical rule of thumb is heuristically derived for selecting the values of their parameters in practice. THEORY: We prove that, unlike their earlier versions, the two new LZ-based estimators are universally consistent, that is, they converge to the entropy rate for every finite-valued, stationary and ergodic process. An effective method is derived for the accurate approximation of the entropy rate of a finite-state hidden Markov model (HMM) with known distribution. Heuristic calculations are presented and approximate formulas are derived for evaluating the bias and the standard error of each estimator.
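To make the comparison concrete, the following is a minimal sketch of two of the estimator families named in the abstract, specialized to binary data: the plug-in (block) estimator and a match-length LZ-type estimator. This is illustrative code, not the authors' implementation; the LZ variant shown is the standard increasing-window match-length estimator based on the Wyner-Ziv/Ornstein-Weiss match-length asymptotics, not necessarily either of the paper's two new versions, and the function names (`plugin_entropy_rate`, `lz_match_length_entropy`) are our own.

```python
# Minimal sketch, assuming binary data given as a list of 0/1 ints and base-2
# logarithms, so all estimates are in bits per symbol. Not the paper's code.

import math
from collections import Counter


def plugin_entropy_rate(x, w):
    """Plug-in estimate: empirical entropy of w-blocks, divided by w."""
    total = len(x) - w + 1
    counts = Counter(tuple(x[i:i + w]) for i in range(total))
    h_w = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h_w / w


def lz_match_length_entropy(x):
    """Match-length (LZ-type) estimate: L_i is the length of the shortest
    substring starting at position i that does not occur in the past x[0:i].
    For stationary ergodic data L_i grows like log2(i) / H, so the estimate
    is the reciprocal of the average of L_i / log2(i). (The paper's LZ
    estimators differ in window and overlap conventions; this sketch uses
    the simple strictly-in-the-past convention.)"""
    s = "".join(map(str, x))
    n = len(s)
    ratios = []
    for i in range(2, n):  # start at i = 2 so that log2(i) > 0
        l = 1
        while i + l <= n and s[i:i + l] in s[:i]:
            l += 1
        ratios.append(l / math.log2(i))
    return len(ratios) / sum(ratios)


if __name__ == "__main__":
    import random
    random.seed(0)
    # i.i.d. Bernoulli(0.3) bits: true entropy rate is about 0.881 bits/symbol
    x = [1 if random.random() < 0.3 else 0 for _ in range(4000)]
    print("plug-in, w=5 :", plugin_entropy_rate(x, 5))
    print("LZ match-len :", lz_match_length_entropy(x[:1500]))  # O(n^2) sketch
```

Both estimates are biased on short sequences (the plug-in downward with block length w, the match-length estimator through the slow log2(i) convergence), which is exactly the kind of effect the bias and standard-error analysis in the paper quantifies; the naive substring search also makes the LZ sketch quadratic, so a serious implementation would use a suffix tree or similar index.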
