Multi-Armed Bandit Bayesian Decision Making
[1] W. James. The Principles of Psychology, Vol. I, 2008.
[2] John McCarthy, et al. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955, 2006, AI Mag.
[3] W. D. Penny, et al. Real-time brain-computer interfacing: A preliminary study using Bayesian learning, 2006, Medical and Biological Engineering and Computing.
[4] Mehryar Mohri, et al. Multi-armed Bandit Algorithms and Empirical Evaluation, 2005, ECML.
[5] L. Merabet, et al. The plastic human brain cortex, 2005, Annual Review of Neuroscience.
[6] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[7] Martin J. Osborne, et al. An Introduction to Game Theory, 2003.
[8] M. Tribus, et al. Probability theory: the logic of science, 2003.
[9] Shie Mannor, et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes, 2002, COLT.
[10] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[11] Nicolò Cesa-Bianchi, et al. Finite-Time Regret Bounds for the Multiarmed Bandit Problem, 1998, ICML.
[12] David H. Wolpert, et al. Bandit problems and the exploration/exploitation tradeoff, 1998, IEEE Trans. Evol. Comput.
[13] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[14] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[15] R. McKelvey, et al. Quantal Response Equilibria for Normal Form Games, 1995.
[16] John R. Kirby, et al. Intelligence and Social Policy, 1995.
[17] L. Kaelbling. Learning in embedded systems, 1993.
[18] Manfred K. Warmuth, et al. The weighted majority algorithm, 1989, 30th Annual Symposium on Foundations of Computer Science.
[19] Jean Walrand, et al. Extensions of the multiarmed bandit problem: The discounted case, 1985.
[20] Leslie G. Valiant, et al. A theory of the learnable, 1984, STOC '84.
[21] J. Gittins, et al. A dynamic allocation index for the discounted multiarmed bandit problem, 1979.
[22] J. Gittins. Bandit processes and dynamic allocation indices, 1979.
[23] R. Duncan Luce, et al. Individual Choice Behavior, 1959.
[24] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[25] J. Neumann, et al. Theory of games and economic behavior, 1945, 100 Years of Math Milestones.
[26] W. R. Thompson. On the Theory of Apportionment, 1935.
[27] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, 1933.
[28] W. James. The principles of psychology, 1983.