Cramér-Rao Lower Bounds Arising from Generalized Csiszár Divergences

We study the geometry of probability distributions with respect to a generalized family of Csiszár $f$-divergences. A member of this family is the relative $\alpha$-entropy, which is a Rényi analog of relative entropy in information theory and is known as the logarithmic or projective power divergence in statistics. We apply Eguchi's theory to derive the Fisher information metric and the dual affine connections arising from these generalized divergence functions. This enables us to arrive at a more widely applicable version of the Cramér-Rao inequality, which provides a lower bound on the variance of an estimator of an escort of the underlying parametric probability distribution. We then extend the Amari-Nagaoka dually flat structure of the exponential and mixture families to other distributions with respect to the aforementioned generalized metric. We show that these formulations yield unbiased and efficient estimators for the escort model. Finally, we compare our work with prior results on generalized Cramér-Rao inequalities that were derived from non-information-geometric frameworks.
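
For context, the following display recalls the standard ingredients the abstract refers to; it is a background sketch in generic notation ($D$, $g^{(D)}_{ij}$, $\Gamma^{(D)}_{ij,k}$, $p_\theta^{(\alpha)}$), not a reproduction of the paper's own derivation. Eguchi's construction assigns to any smooth divergence $D$ with $D(p\,\|\,q)\geq 0$ and $D(p\,\|\,p)=0$ a Riemannian metric and a pair of dual affine connections on a parametric family $\{p_\theta\}$:
$$
g^{(D)}_{ij}(\theta) = -\,\partial_i\,\partial'_j\, D\big(p_\theta \,\|\, p_{\theta'}\big)\Big|_{\theta'=\theta},
$$
$$
\Gamma^{(D)}_{ij,k}(\theta) = -\,\partial_i\,\partial_j\,\partial'_k\, D\big(p_\theta \,\|\, p_{\theta'}\big)\Big|_{\theta'=\theta},
\qquad
\Gamma^{*(D)}_{ij,k}(\theta) = -\,\partial'_i\,\partial'_j\,\partial_k\, D\big(p_\theta \,\|\, p_{\theta'}\big)\Big|_{\theta'=\theta},
$$
where $\partial_i = \partial/\partial\theta^i$ and $\partial'_j = \partial/\partial\theta'^j$. Taking $D$ to be the Kullback-Leibler divergence recovers the Fisher information metric $g_{ij}(\theta) = E_\theta\big[\partial_i \log p_\theta \,\partial_j \log p_\theta\big]$ and, for an unbiased estimator $\hat{\theta}$, the classical Cramér-Rao bound $\mathrm{Cov}_\theta(\hat{\theta}) \succeq G(\theta)^{-1}$ with $G(\theta) = [g_{ij}(\theta)]$. The escort distribution mentioned in the abstract is the normalized $\alpha$-power $p_\theta^{(\alpha)}(x) = p_\theta(x)^\alpha \big/ \sum_y p_\theta(y)^\alpha$; as described above, the generalized bound concerns estimators of this escort model, with the metric induced by the generalized divergence playing the role of the Fisher information.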
