The Partial Entropy Decomposition: Decomposing multivariate entropy and mutual information via pointwise common surprisal

Obtaining meaningful quantitative descriptions of the statistical dependence within multivariate systems is a difficult open problem. Recently, the Partial Information Decomposition (PID) was proposed to decompose mutual information (MI) about a target variable into components which are redundant, unique, and synergistic within different subsets of predictor variables. Here, we propose to apply the elegant formalism of the PID to multivariate entropy, resulting in a Partial Entropy Decomposition (PED). We implement the PED with an entropy redundancy measure based on pointwise common surprisal, a natural definition which is closely related to the definition of MI. We show how this approach can reveal the dyadic versus triadic generative structure of multivariate systems that are indistinguishable with classical Shannon measures. The entropy perspective also shows that misinformation is synergistic entropy, and hence that MI itself includes both redundant and synergistic effects. We show the relationships between the PED and MI in the case of two predictors, and derive two alternative information decompositions which we illustrate on several example systems. This reveals that, in entropy terms, univariate predictor MI is not a proper subset of the joint MI, and we suggest this previously unrecognised fact explains in part why obtaining a consistent PID has proven difficult. The PED also allows separate quantification of mechanistic redundancy (related to the function of the system) versus source redundancy (arising from dependencies between inputs), an important distinction that no existing method can address. The new perspective provided by the PED helps to clarify some of the difficulties encountered with the PID approach, and the resulting decompositions provide useful tools for practical data analysis across a wide range of application areas.
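To make the two-predictor case concrete, the sketch below splits the pointwise mutual information of a discrete joint distribution into its positive terms (common surprisal) and its negative terms (misinformation), and checks that their difference recovers the MI. This is a minimal illustration, not the paper's implementation: the function name ped_two_variables and the example distribution are hypothetical, and the identification of the positive and negative pointwise sums with redundant and synergistic entropy is stated here only as the relationship suggested by the abstract.

```python
import numpy as np

def ped_two_variables(joint):
    """Two-variable entropy-decomposition sketch for a discrete joint distribution.

    `joint` is a 2D array with joint[x1, x2] = p(x1, x2).
    Returns (redundant, synergistic, mutual_information) in bits, where the
    redundant term sums the positive pointwise mutual information (common
    surprisal) and the synergistic term sums the magnitude of the negative
    pointwise terms (misinformation), each weighted by p(x1, x2).
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalise to a probability distribution
    p1 = joint.sum(axis=1)               # marginal p(x1)
    p2 = joint.sum(axis=0)               # marginal p(x2)

    red = syn = mi = 0.0
    for i in range(joint.shape[0]):
        for j in range(joint.shape[1]):
            p = joint[i, j]
            if p == 0:
                continue
            pmi = np.log2(p / (p1[i] * p2[j]))   # pointwise MI i(x1; x2)
            mi += p * pmi
            if pmi > 0:
                red += p * pmi                   # common surprisal terms
            else:
                syn += p * (-pmi)                # misinformation terms
    return red, syn, mi

# Hypothetical example: two correlated binary variables.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
red, syn, mi = ped_two_variables(joint)
print(f"redundant={red:.4f}  synergistic={syn:.4f}  "
      f"difference={red - syn:.4f}  I(X1;X2)={mi:.4f}")
```

By construction the difference of the two sums equals the mutual information, so in this reading MI is the redundant (common-surprisal) contribution minus the synergistic (misinformation) contribution, consistent with the abstract's claim that MI mixes redundant and synergistic effects.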
