Information-theoretic upper and lower bounds for statistical estimation

In this paper, we establish upper and lower bounds for some statistical estimation problems through concise information-theoretic arguments. Our upper bound analysis is based on a simple yet general inequality, which we call the information exponential inequality. We show that this inequality naturally leads to a general randomized estimation method, for which performance upper bounds can be obtained. The lower bounds, applicable to all statistical estimators, are obtained by original applications of some well-known information-theoretic inequalities, and approximately match the obtained upper bounds for various important problems. Moreover, our framework can be regarded as a natural generalization of the standard minimax framework, in that we allow the performance of the estimator to vary for different possible underlying distributions according to a predefined prior.
