A general minimax result for relative entropy

Suppose nature picks a probability measure P_θ on a complete separable metric space S at random from a measurable set P_Θ = {P_θ : θ ∈ Θ}. Then, without knowing θ, a statistician picks a measure Q on S. Finally, the statistician suffers a loss D(P_θ ‖ Q), the relative entropy between P_θ and Q. We show that the minimax and maximin values of this game are always equal, and that there is always a minimax strategy in the closure of the set of all Bayes strategies. This generalizes previous results of Gallager (1979) and of Davisson and Leon-Garcia (1980).
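In standard notation, the game described above can be summarized as follows; this is a sketch using the symbols of the abstract, and the identification of the common value with a supremum of mutual information is the usual redundancy-capacity reading of such results, stated here as background rather than quoted from the paper:

\[
\inf_{Q}\;\sup_{\theta\in\Theta} D(P_\theta \,\|\, Q)
\;=\;
\sup_{\pi}\;\inf_{Q}\;\int_{\Theta} D(P_\theta \,\|\, Q)\, d\pi(\theta)
\;=\;
\sup_{\pi} I_\pi(\theta; X),
\]

where π ranges over priors on Θ, the inner infimum on the right-hand side is attained by the Bayes mixture Q_π = ∫ P_θ dπ(θ), and I_π(θ; X) denotes the mutual information between the parameter θ ~ π and an observation X ∈ S drawn from P_θ. The first equality is the minimax/maximin identity claimed in the abstract; the second is the standard fact that the Bayes risk under a prior equals the mutual information it induces.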

[1] Andrew R. Barron et al., "A bound on the financial value of information," IEEE Trans. Inf. Theory, 1988.

[2] Amiel Feinstein et al., Information and Information Stability of Random Variables and Processes, 1964.

[3] D. Haussler et al., "Mutual information, metric entropy, and risk in estimation of probability distributions," 1996.

[4] A. Barron et al., "Jeffreys' prior is asymptotically least favorable under entropy risk," 1994.

[5] Michael Kearns et al., "Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension," Proc. IJCNN International Joint Conference on Neural Networks, 1992.

[6] Thomas M. Cover et al., Elements of Information Theory, 2005.

[7] Neri Merhav et al., "A strong version of the redundancy-capacity theorem of universal coding," IEEE Trans. Inf. Theory, 1995.

[8] David Haussler et al., "General bounds on the mutual information between a parameter and n conditionally independent observations," Proc. COLT '95, 1995.

[9] Alberto Leon-Garcia et al., "A source matching approach to finding minimax codes," IEEE Trans. Inf. Theory, 1980.

[10] Lee D. Davisson et al., "Universal noiseless coding," IEEE Trans. Inf. Theory, 1973.

[11] Jürgen Krob et al., "A minimax result for the Kullback-Leibler Bayes risk," 1997.

[12] David Haussler et al., "Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension," Proc. COLT '91, 1991.

[13] David Haussler et al., "How well do Bayes methods work for on-line prediction of {±1} values?," 1992.

[14] Gerald S. Rogers et al., Mathematical Statistics: A Decision Theoretic Approach, 1967.

[15] D. Haussler et al., "Mutual information, metric entropy and cumulative relative entropy risk," 1997.

[16] Andrew R. Barron et al., "Information-theoretic asymptotics of Bayes methods," IEEE Trans. Inf. Theory, 1990.

[17] L. Le Cam, "An extension of Wald's theory of statistical decision functions," 1955.

[18] Edward C. Posner et al., "Random coding strategies for minimum entropy," IEEE Trans. Inf. Theory, 1975.