The strong law of large numbers for sequential decisions under uncertainty

This paper combines optimization and ergodic theory to characterize the optimum long-run average performance that can be asymptotically attained by nonanticipating sequential decisions. Let $\{X_t\}$ be a stationary ergodic process, and suppose an action $b_t$ must be selected in a space $\mathcal{B}$ with knowledge of the $t$-past $(X_0, \cdots, X_{t-1})$ at the beginning of every period $t \ge 0$. Action $b_t$ will incur a loss $l(b_t, X_t)$ at the end of period $t$, when the random variable $X_t$ is revealed. The author proves under mild integrability conditions that the optimum strategy is to select, at each step, an action that minimizes the conditional expected loss given the currently available information. The minimum long-run average loss per decision can be approached arbitrarily closely by strategies that are finite-order Markov, and under certain continuity conditions, it is equal to the minimum expected loss given the infinite past. If the loss $l(b, x)$ is bounded and continuous and if the space $\mathcal{B}$ is compact, then the minimum can be asymptotically attained, even if the distribution of the process $\{X_t\}$ is unknown a priori and must be learned from experience.
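
To make the idea of a finite-order Markov strategy that learns from experience concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a finite outcome alphabet and a finite action set, and it replaces the unknown conditional distribution with empirical frequencies of the outcomes that followed the current context in the past. The names `markov_empirical_strategy` and `sticky_binary_chain`, the fixed `order`, and the 0-1 loss in the usage note are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def markov_empirical_strategy(xs, actions, loss, order):
    """Play the sequence xs with a fixed-order plug-in strategy.

    At each time t the action minimizes the empirical loss over the
    outcomes that previously followed the same length-`order` context.
    Returns the average loss per decision.
    """
    # counts[context][x] = times outcome x has followed `context` so far
    counts = defaultdict(lambda: defaultdict(int))
    total = 0.0
    for t, x in enumerate(xs):
        # Nonanticipating: the context uses only x_0, ..., x_{t-1}.
        context = tuple(xs[max(0, t - order):t])
        seen = counts[context]
        # Plug-in estimate of the conditional expected loss (unnormalized).
        # With no data for this context, ties go to the first action.
        b = min(actions, key=lambda a: sum(n * loss(a, y) for y, n in seen.items()))
        total += loss(b, x)      # loss is incurred once x_t is revealed
        seen[x] += 1             # learn from experience
    return total / len(xs)


def sticky_binary_chain(n, stay, seed):
    """Hypothetical test source: binary Markov chain that repeats its
    previous symbol with probability `stay`."""
    rng = random.Random(seed)
    xs = [rng.randint(0, 1)]
    for _ in range(n - 1):
        xs.append(xs[-1] if rng.random() < stay else 1 - xs[-1])
    return xs
```

As a usage example, take `xs = sticky_binary_chain(20000, 0.9, 0)` and the 0-1 loss `lambda b, x: 0.0 if b == x else 1.0` with `actions = [0, 1]`. Calling the strategy with `order=1` should give an average loss close to the flip probability 0.1, while `order=0` ignores the dependence on the past and should do markedly worse. A scheme that works without knowing the source would presumably have to let the order grow with the amount of data; this sketch keeps it fixed for simplicity.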
