Conditional Rényi Divergence Saddlepoint and the Maximization of α-Mutual Information

Rényi-type generalizations of entropy, relative entropy, and mutual information have found numerous applications throughout information theory and beyond. While there is consensus that the ways A. Rényi generalized entropy and relative entropy in 1961 are the “right” ones, several candidates have been put forth as possible mutual informations of order α. In this paper we lend further evidence to the notion that a Bayesian measure of statistical distinctness introduced by R. Sibson in 1969 (closely related to Gallager’s E0 function) is the most natural generalization, lending itself to explicit computation and maximization as well as to closed-form formulas. This paper considers general (not necessarily discrete) alphabets and extends the major analytical results on the saddle-point and saddle-level of the conditional relative entropy to the conditional Rényi divergence. Several examples illustrate the main application of these results, namely, the maximization of α-mutual information with and without constraints.
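
For orientation only (this recap is not part of the paper’s text, which treats general alphabets): in the discrete-alphabet case the measure in question, Sibson’s α-mutual information of an input distribution P_X and a channel P_{Y|X}, is usually written as

\[
I_\alpha(X;Y) \;=\; \frac{\alpha}{\alpha-1}\,\log \sum_{y}\Bigl(\sum_{x} P_X(x)\,P_{Y|X}(y\mid x)^{\alpha}\Bigr)^{1/\alpha},
\qquad \alpha\in(0,1)\cup(1,\infty),
\]

which recovers Shannon’s mutual information as α → 1. Its link to Gallager’s E0 function follows by setting ρ = (1−α)/α (equivalently α = 1/(1+ρ)):

\[
E_0(\rho,P_X) \;=\; -\log \sum_{y}\Bigl(\sum_{x} P_X(x)\,P_{Y|X}(y\mid x)^{1/(1+\rho)}\Bigr)^{1+\rho}
\;=\; \rho\, I_{1/(1+\rho)}(X;Y),
\]

so maximizing α-mutual information over the input distribution is equivalent to maximizing E0(ρ, ·) at the corresponding ρ. The following minimal Python sketch illustrates the “explicit computation and maximization” mentioned above for a toy binary symmetric channel; the function name and the grid-search maximization are illustrative choices, not the paper’s method, which relies on analytical saddle-point results.

import numpy as np

def sibson_alpha_mi(p_x, W, alpha):
    """Sibson's alpha-mutual information (in nats) for a discrete channel.

    p_x   : input distribution, shape (|X|,)
    W     : channel matrix with W[x, y] = P(y|x), shape (|X|, |Y|)
    alpha : Renyi order, alpha > 0 and alpha != 1
    """
    inner = p_x @ (W ** alpha)  # sum_x P(x) P(y|x)^alpha, one entry per output y
    return alpha / (alpha - 1.0) * np.log(np.sum(inner ** (1.0 / alpha)))

# Toy example: binary symmetric channel with crossover probability 0.1.
# Maximize over Bernoulli(p) inputs by a simple grid search.
delta = 0.1
W = np.array([[1 - delta, delta],
              [delta, 1 - delta]])
alpha = 2.0
grid = np.linspace(1e-3, 1 - 1e-3, 999)
vals = [sibson_alpha_mi(np.array([p, 1 - p]), W, alpha) for p in grid]
p_star = grid[int(np.argmax(vals))]
print(p_star, max(vals))  # by symmetry the maximizer is p = 1/2

By the symmetry of this channel the grid search returns the uniform input, matching what the analytical maximization predicts.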
