Information in the Nonstationary Case

Information estimates such as the "direct method" of Strong et al. (1998) sidestep the difficult problem of estimating the joint distribution of response and stimulus by instead estimating the difference between the marginal and conditional entropies of the response. While this is an effective estimation strategy, it tempts the practitioner to ignore the role of the stimulus and the meaning of mutual information. We show here that, as the number of trials increases indefinitely, the direct (or "plug-in") estimate of the marginal entropy converges (with probability 1) to the entropy of the time-averaged conditional distribution of the response, and the direct estimate of the conditional entropy converges to the time-averaged entropy of the conditional distribution of the response. Under joint stationarity and ergodicity of the response and stimulus, the difference of these quantities converges to the mutual information. When the stimulus is deterministic or nonstationary, mutual information is no longer meaningful and the direct estimate no longer estimates it; the estimate does, however, remain a measure of the variability of the response distribution across time.
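As a rough sketch of the quantities involved, in notation of our own that does not appear in the abstract: let $p_t$ denote the conditional distribution of the (discretized) response in time bin $t = 1, \dots, T$, and let $\hat{H}_{\mathrm{total}}$ and $\hat{H}_{\mathrm{noise}}$ denote the plug-in estimates of the marginal and conditional response entropies used by the direct method. The convergence statements above then read, as the number of trials grows,
\[
  \hat{H}_{\mathrm{total}} \;\longrightarrow\; H\!\left(\frac{1}{T}\sum_{t=1}^{T} p_t\right)
  \quad\text{and}\quad
  \hat{H}_{\mathrm{noise}} \;\longrightarrow\; \frac{1}{T}\sum_{t=1}^{T} H(p_t)
  \qquad \text{(with probability 1)},
\]
so that the direct information estimate converges to
\[
  H\!\left(\frac{1}{T}\sum_{t=1}^{T} p_t\right) \;-\; \frac{1}{T}\sum_{t=1}^{T} H(p_t) \;\ge\; 0,
\]
which equals the mutual information when response and stimulus are jointly stationary and ergodic, and otherwise quantifies how much the conditional response distribution $p_t$ varies across time.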

[1] William Bialek et al., Entropy and Information in Neural Spike Trains, cond-mat/9603127, 1996.

[2] M. R. DeWeese et al., How to Measure the Information Gained from One Symbol, Network, 1999.

[3] Sarah M. N. Woolley et al., Modulation Power and Phase Spectrum of Natural Sounds Enhance Neural Encoding Performed by Single Auditory Neurons, The Journal of Neuroscience, 2004.

[4] Matthew A. Wilson et al., Dynamic Analyses of Information Encoding in Neural Ensembles, Neural Computation, 2004.

[5] Jonathon Shlens et al., Estimating Entropy Rates with Bayesian Confidence Intervals, Neural Computation, 2005.

[6] P. Latham et al., Retinal Ganglion Cells Act Largely as Independent Encoders, Nature, 2001.

[7] Jonathan D. Victor, Approaches to Information-Theoretic Analysis of Neural Activity, Biological Theory, 2006.

[8] H. Künsch, Discrimination Between Monotonic Trends and Long-Range Dependence, 1986.

[9] Jan Beran, Statistics for Long-Memory Processes, 1994.

[10] Yun Gao et al., From the Entropy to the Statistical Structure of Spike Trains, 2006 IEEE International Symposium on Information Theory, 2006.

[11] F. Mechler et al., Formal and Attribute-Specific Information in Primary Visual Cortex, Journal of Neurophysiology, 2001.

[12] Thomas M. Cover and Joy A. Thomas, Elements of Information Theory, 2005.

[13] Bin Yu et al., Coverage-Adjusted Entropy Estimation, Statistics in Medicine, 2007.

[14] Alexander Borst et al., Information Theory and Neural Coding, Nature Neuroscience, 1999.

[15] C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 1948.

[16] R. Reid et al., Temporal Coding of Visual Information in the Thalamus, The Journal of Neuroscience, 2000.

[17] A. Antos et al., Convergence Properties of Functional Estimates for Discrete Distributions, 2001.