The apparent conflict between estimation and control—a survey of the two-armed bandit problem

[1]  W. R. Thompson On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .

[2]  W. R. Thompson On the Theory of Apportionment , 1935 .

[3]  R. Bellman A Problem in the Sequential Design of Experiments , 1954 .

[4]  H. Robbins,et al.  A Sequential Decision Problem with a Finite Memory , 1956, Proceedings of the National Academy of Sciences of the United States of America.

[5]  Frederick Mosteller,et al.  Stochastic Models for Learning , 1956 .

[6]  R. N. Bradt,et al.  On Sequential Designs for Maximizing the Sum of $n$ Observations , 1956 .

[7]  J. Isbell On a Problem of Robbins , 1959 .

[8]  John McCarthy,et al.  Programs with common sense , 1960 .

[9]  W. Vogel A Sequential Design for the Two Armed Bandit , 1960 .

[10]  Richard Bellman,et al.  Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.

[11]  Dorian Feldman Contributions to the "Two-Armed Bandit" Problem , 1962 .

[12]  A. A. Feldbaum Dual control theory problems , 1963 .

[13]  Carter Smith,et al.  The Robbins-Isbell Two-Armed-Bandit Problem with Finite Memory , 1965 .

[14]  D. D. Sworder,et al.  A study of the relationship between identification and optimization in adaptive control problems , 1966 .

[15]  William Feller,et al.  An Introduction to Probability Theory and Its Applications , 1967 .

[16]  S. M. Samuels Randomized Rules for the Two-Armed-Bandit with Finite Memory , 1968 .

[17]  Thomas M. Cover,et al.  A Note on the Two-Armed Bandit Problem with Finite Memory , 1968, Inf. Control.

[18]  T. Cover Hypothesis Testing with Finite Statistics , 1969 .

[19]  Thomas M. Cover,et al.  Finite-memory hypothesis testing-Comments on a critique (Corresp.) , 1970, IEEE Transactions on Information Theory.

[20]  T. Cover,et al.  Learning with Finite Memory , 1970 .

[21]  Thomas M. Cover,et al.  The two-armed-bandit problem with time-invariant finite memory , 1970, IEEE Trans. Inf. Theory.

[22]  B. Chandrasekaran Finite-memory hypothesis testing-A critique (Corresp.) , 1970, IEEE Trans. Inf. Theory.

[23]  B. Chandrasekaran,et al.  Reply to 'Finite memory hypothesis testing-Comments on a critique' by Cover, T. M., and Hellman, M. E , 1971, IEEE Trans. Inf. Theory.

[24]  T. Cover,et al.  On Memory Saved by Randomization , 1971 .

[25]  Martin E. Hellman,et al.  The effects of randomization on finite-memory decision schemes , 1972, IEEE Trans. Inf. Theory.

[26]  Martin E. Hellman,et al.  Hypothesis testing with finite memory in finite time (Corresp.) , 1972, IEEE Trans. Inf. Theory.

[27]  Ian H. Witten Finite-Time Performance of Some Two-Armed Bandit Controllers , 1973, IEEE Trans. Syst. Man Cybern.

[28]  Ian H. Witten,et al.  Human operators and automatic adaptive controllers: A comparative study on a particular control task , 1973 .

[29]  B. Shubert,et al.  Testing a simple symmetric hypothesis by a finite-memory deterministic algorithm , 1973, IEEE Trans. Inf. Theory.

[30]  Kumpati S. Narendra,et al.  Learning Automata - A Survey , 1974, IEEE Trans. Syst. Man Cybern.

[31]  H. Robbins Some aspects of the sequential design of experiments , 1952 .