Information in the Nonstationary Case

Information estimates such as the direct method of Strong, Koberle, de Ruyter van Steveninck, and Bialek (1998) sidestep the difficult problem of estimating the joint distribution of response and stimulus by instead estimating the difference between the marginal and conditional entropies of the response. While this is an effective estimation strategy, it tempts the practitioner to ignore the role of the stimulus and the meaning of mutual information. We show here that as the number of trials increases indefinitely, the direct (or plug-in) estimate of the marginal entropy converges (with probability 1) to the entropy of the time-averaged conditional distribution of the response, and the direct estimate of the conditional entropy converges to the time-averaged entropy of the conditional distribution of the response. Under joint stationarity and ergodicity of the response and stimulus, the difference of these quantities converges to the mutual information. When the stimulus is deterministic or nonstationary, mutual information is no longer meaningful, and the direct estimate no longer estimates it; the estimate does, however, remain a measure of the variability of the response distribution across time.
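
To connect these convergence statements to a computation, the following is a minimal sketch of the direct (plug-in) estimate for discretized responses. It assumes the data are arranged as a trials-by-time NumPy array of response "words"; the array layout, function names, and use of base-2 logarithms are illustrative assumptions, not taken from the paper, and the sketch omits the word-length and data-fraction extrapolations used in the full direct method of Strong et al. (1998).

```python
import numpy as np

def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy, in bits, from a vector of counts."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def direct_information(words):
    """Direct-method information estimate from a trials x time array of
    discretized responses ("words").

    Marginal entropy: plug-in entropy of the word distribution pooled over
    trials and time bins.
    Conditional (noise) entropy: plug-in entropy across trials at each time
    bin, averaged over time bins.
    """
    n_time = words.shape[1]

    # Marginal entropy of the pooled (time-averaged) word distribution
    pooled_counts = np.unique(words, return_counts=True)[1]
    h_marginal = plugin_entropy(pooled_counts)

    # Time-averaged conditional entropy: entropy across trials at each time bin
    h_conditional = np.mean([
        plugin_entropy(np.unique(words[:, t], return_counts=True)[1])
        for t in range(n_time)
    ])

    return h_marginal - h_conditional

# Example: 200 trials x 500 time bins of stimulus-independent responses;
# the estimate should be close to zero (apart from the usual plug-in bias).
rng = np.random.default_rng(0)
words = rng.integers(0, 4, size=(200, 500))
print(direct_information(words))
```

In this sketch the first term plays the role of the marginal (total) entropy and the second the time-averaged conditional (noise) entropy; as the abstract notes, their difference estimates mutual information only when response and stimulus are jointly stationary and ergodic.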

[1] Jonathon Shlens et al. Estimating Entropy Rates with Bayesian Confidence Intervals. Neural Computation, 2005.

[2] Jan Beran et al. Statistics for long-memory processes. 1994.

[3] R. Reid et al. Temporal Coding of Visual Information in the Thalamus. The Journal of Neuroscience, 2000.

[4] Thomas M. Cover et al. Elements of Information Theory. 2005.

[5] Bin Yu et al. Some statistical issues in estimating information in neural spike trains. 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009.

[6] F. Mechler et al. Formal and attribute-specific information in primary visual cortex. Journal of Neurophysiology, 2001.

[7] H. Künsch. Discrimination between monotonic trends and long-range dependence. 1986.

[8] Sang Joon Kim et al. A Mathematical Theory of Communication. 2006.

[9] M. R. DeWeese et al. How to measure the information gained from one symbol. Network, 1999.

[10] Sarah M. N. Woolley et al. Modulation Power and Phase Spectrum of Natural Sounds Enhance Neural Encoding Performed by Single Auditory Neurons. The Journal of Neuroscience, 2004.

[11] Matthew A. Wilson et al. Dynamic Analyses of Information Encoding in Neural Ensembles. Neural Computation, 2004.

[12] Yun Gao et al. From the Entropy to the Statistical Structure of Spike Trains. 2006 IEEE International Symposium on Information Theory, 2006.

[13] A. Antos et al. Convergence properties of functional estimates for discrete distributions. 2001.

[14] P. Latham et al. Retinal ganglion cells act largely as independent encoders. Nature, 2001.

[15] Jonathan D. Victor et al. Approaches to Information-Theoretic Analysis of Neural Activity. Biological Theory, 2006.

[16] Bin Yu et al. Coverage-adjusted entropy estimation. Statistics in Medicine, 2007.

[17] Alexander Borst et al. Information theory and neural coding. Nature Neuroscience, 1999.

[18] William Bialek et al. Entropy and Information in Neural Spike Trains. cond-mat/9603127, 1996.