Proving and Disproving Information Inequalities: Theory and Scalable Algorithms

Proving or disproving an information inequality is a crucial step in establishing converse results in coding theorems. However, an information inequality involving more than a few random variables is difficult to prove or disprove manually. In 1997, Yeung developed a framework that uses linear programming to verify linear information inequalities. Under this framework, this paper considers several further problems that can be solved using Lagrange duality and convex approximation. We demonstrate how linear programming can be used to find an analytic proof of an information inequality or, if the inequality is not true in general, an analytic counterexample that disproves it. We also explore how to automatically find a shortest proof or a smallest counterexample. When a given information inequality cannot be proved, linear programming is used to find sufficient conditions for a counterexample that disproves it. Lastly, we propose a scalable algorithmic framework based on the alternating direction method of multipliers that accelerates solving a multitude of user-specific problems, with an overall computational cost that can be amortized over the number of users, and we present its publicly available software implementation for large-scale problems.
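
To illustrate what an analytic proof can look like in the linear programming setting, here is a minimal sketch. It assumes the standard formulation in which a Shannon-type inequality is proved by writing it as a nonnegative combination of elemental inequalities, so the weights (the dual variables of the linear program) act as a proof certificate. The function names `elemental_inequalities`, `combine`, and `is_valid_proof` are hypothetical and are not taken from the paper's software. The code only checks a given certificate with exact rational arithmetic; finding the weights, or the sparsest such weights, would be the job of an LP solver.

```python
from fractions import Fraction
from itertools import combinations


def elemental_inequalities(n):
    """Return the elemental Shannon inequalities for n random variables.

    Each inequality is a dict mapping a nonempty subset of variable
    indices (a frozenset) to the coefficient of its joint entropy, and
    reads: sum(coef * H(subset)) >= 0.
    """
    full = frozenset(range(n))
    rows = []
    # H(X_i | all other variables) >= 0
    for i in range(n):
        row = {full: 1}
        rest = full - {i}
        if rest:
            row[rest] = -1
        rows.append(row)
    # I(X_i; X_j | X_K) >= 0 for i < j and K a subset of the remaining variables
    for i, j in combinations(range(n), 2):
        others = sorted(full - {i, j})
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                K = frozenset(subset)
                row = {K | {i}: 1, K | {j}: 1, K | {i, j}: -1}
                if K:
                    row[K] = -1  # H of the empty set is 0, so it is omitted
                rows.append(row)
    return rows


def combine(rows, weights):
    """Return the weighted sum of inequalities, requiring weights >= 0."""
    total = {}
    for row, w in zip(rows, weights):
        w = Fraction(w)
        if w < 0:
            raise ValueError("proof weights must be nonnegative")
        for subset, coef in row.items():
            total[subset] = total.get(subset, 0) + w * coef
    return {s: c for s, c in total.items() if c != 0}


def is_valid_proof(target, rows, weights):
    """Check that the weighted sum of rows equals the target inequality."""
    expected = {s: Fraction(c) for s, c in target.items() if c != 0}
    return combine(rows, weights) == expected


# Example with two variables: H(X0) >= 0 follows from
# H(X0 | X1) >= 0 plus I(X0; X1) >= 0.
rows = elemental_inequalities(2)
target = {frozenset({0}): 1}
print(is_valid_proof(target, rows, [1, 0, 1]))
```

In this representation, a shorter proof corresponds to a weight vector with fewer nonzero entries, which is one way to read the "shortest proof" problem mentioned above.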
