Pairwise Maximum Entropy Models for Studying Large Biological Systems: When They Can Work and When They Can't

One of the most critical problems we face in the study of biological systems is building accurate statistical descriptions of them. This problem has been particularly challenging because biological systems typically contain large numbers of interacting elements, which precludes the use of standard brute-force approaches. Recently, though, several groups have reported that there may be an alternate strategy. The reports show that reliable statistical models can be built without knowledge of all the interactions in a system; instead, pairwise interactions can suffice. These findings, however, are based on the analysis of small subsystems. Here, we ask whether the observations will generalize to systems of realistic size, that is, whether pairwise models will provide reliable descriptions of true biological systems. Our results show that, in most cases, they will not. The reason is that there is a crossover in the predictive power of pairwise models: If the size of the subsystem is below the crossover point, then the results have no predictive power for large systems. If the size is above the crossover point, then the results may have predictive power. This work thus provides a general framework for determining the extent to which pairwise models can be used to predict the behavior of large biological systems. When this framework is applied to neural data, it shows that most systems studied so far are below the crossover point.
