Decomposing information into copying versus transformation

In many real-world systems, information can be transmitted in two qualitatively different ways: by copying or by transformation. Copying occurs when messages are transmitted without modification, for example when an offspring receives an unaltered copy of a gene from its parent. Transformation occurs when messages are modified in a systematic way during transmission, e.g., when non-random mutations occur during biological reproduction. Standard information-theoretic measures of transmission, such as mutual information, do not distinguish these two modes of transfer, even though they may reflect different mechanisms and have different functional consequences. We propose a decomposition of mutual information that separately quantifies the information transmitted by copying and the information transmitted by transformation. Our measures of copy and transformation information are derived from a few simple axioms and have natural operationalizations in terms of hypothesis testing and thermodynamics. In the latter case, we show that our measure of copy information corresponds to the minimal amount of work needed by a physical copying process, a result with special relevance for the physics of the replication of biological information. We demonstrate our measures on a real-world dataset of amino acid substitution rates. Our decomposition into copy and transformation information is general and applies to any system in which the fidelity of copying, rather than simple predictability, is of critical relevance.
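To make the distinction concrete, the following minimal sketch (Python with NumPy) splits the mutual information of a discrete channel, whose input and output share one alphabet, into the contribution of "copy events" (the output symbol equals the input symbol) and "transformation events" (it differs). This naive pointwise split is an illustration we add here, not the axiomatic measures proposed above; in particular, unlike the axiomatically derived measures, its two parts need not each be non-negative for every channel.

```python
import numpy as np

def copy_vs_transformation_split(p_x, p_y_given_x):
    """Split I(X;Y) (in bits) into the contributions of copy events
    (y == x) and transformation events (y != x), for X and Y defined
    on the same alphabet.

    p_x         : shape (n,)   source distribution p(x)
    p_y_given_x : shape (n, n) channel p(y|x); each row sums to 1
    """
    p_xy = p_x[:, None] * p_y_given_x        # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                   # output marginal p(y)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(p_y_given_x / p_y)     # pointwise MI: log p(y|x)/p(y)
        terms = np.where(p_xy > 0, p_xy * pmi, 0.0)
    copy_part = np.trace(terms)              # y == x: symbols sent unmodified
    trans_part = terms.sum() - copy_part     # y != x: symbols modified en route
    return copy_part, trans_part

# Toy "mutation" channel: each symbol is either copied (prob. 0.6) or
# systematically substituted by its cyclic successor (prob. 0.4).
p_x = np.full(3, 1 / 3)
channel = np.array([[0.6, 0.4, 0.0],
                    [0.0, 0.6, 0.4],
                    [0.4, 0.0, 0.6]])
c, t = copy_vs_transformation_split(p_x, channel)
print(f"copy contribution:           {c:.3f} bits")     # ~0.509
print(f"transformation contribution: {t:.3f} bits")     # ~0.105
print(f"I(X;Y):                      {c + t:.3f} bits")  # ~0.614
```

The cyclic channel mimics the non-random mutations described above: when a symbol is not copied, it is always substituted by the same successor symbol, so both the diagonal (copy) and off-diagonal (transformation) contributions come out positive, roughly 0.51 and 0.11 bits, summing to I(X;Y) of about 0.61 bits.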
