Measuring multivariate redundant information with pointwise common change in surprisal

The problem of how to properly quantify redundant information is an open question that has been the subject of much recent research. Redundant information refers to information about a target variable S that is common to two or more predictor variables Xi; it can be thought of as quantifying overlapping information content, or similarities in how the Xi represent S. We present a new measure of redundancy that quantifies the common change in surprisal shared between variables at the local, or pointwise, level. We provide a game-theoretic operational definition of unique information and use it to derive constraints that specify a maximum entropy distribution. Redundancy is then calculated from this maximum entropy distribution by counting only those local co-information terms which admit an unambiguous interpretation as redundant information. We show how this redundancy measure can be used within the framework of the Partial Information Decomposition (PID) to give an intuitive decomposition of the multivariate mutual information into redundant, unique and synergistic contributions. We compare the new measure to existing approaches over a range of example systems, including continuous Gaussian variables. Matlab code for the measure, including all considered examples, is provided.
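
To illustrate the pointwise bookkeeping the abstract describes, the sketch below computes local co-information terms for a discrete joint distribution p(x1, x2, s) and accumulates only those terms whose sign structure reads unambiguously as redundancy. This is not the authors' released Matlab implementation: the function name pointwise_redundancy, the all-positive sign criterion, and the fact that the terms are evaluated on the input distribution itself rather than on the paper's constrained maximum entropy surrogate are all simplifying assumptions made for illustration.

```python
import numpy as np

def pointwise_redundancy(p):
    """Sum local co-information terms with an unambiguously redundant sign structure.

    p: joint probability array of shape (|X1|, |X2|, |S|).
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    p_x1   = p.sum(axis=(1, 2))   # p(x1)
    p_x2   = p.sum(axis=(0, 2))   # p(x2)
    p_s    = p.sum(axis=(0, 1))   # p(s)
    p_x1s  = p.sum(axis=1)        # p(x1, s)
    p_x2s  = p.sum(axis=0)        # p(x2, s)
    p_x1x2 = p.sum(axis=2)        # p(x1, x2)

    red = 0.0
    for i1, i2, js in np.ndindex(*p.shape):
        pj = p[i1, i2, js]
        if pj == 0:
            continue
        # pointwise (local) mutual information terms, in bits
        i_x1s  = np.log2(p_x1s[i1, js] / (p_x1[i1] * p_s[js]))
        i_x2s  = np.log2(p_x2s[i2, js] / (p_x2[i2] * p_s[js]))
        i_x12s = np.log2(pj / (p_x1x2[i1, i2] * p_s[js]))
        # local co-information: overlap between the two single-source terms
        coinfo = i_x1s + i_x2s - i_x12s
        # count a term only when its interpretation as redundancy is unambiguous;
        # the all-positive criterion used here is an illustrative assumption
        if i_x1s > 0 and i_x2s > 0 and coinfo > 0:
            red += pj * coinfo
    return red

# Redundant copy: X1 = X2 = S, a single shared fair bit -> ~1 bit redundant
p_copy = np.zeros((2, 2, 2))
p_copy[0, 0, 0] = 0.5
p_copy[1, 1, 1] = 0.5
print(pointwise_redundancy(p_copy))   # ~1.0

# XOR: S = X1 xor X2 with independent uniform inputs -> 0 bits redundant
p_xor = np.zeros((2, 2, 2))
for a in (0, 1):
    for b in (0, 1):
        p_xor[a, b, a ^ b] = 0.25
print(pointwise_redundancy(p_xor))    # 0.0
```

In the copy example every pointwise term is positive, so the full bit is counted as redundant; in XOR neither source carries pointwise information about the target on its own, so no term qualifies.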
