Hedged learning: regret-minimization with learning experts
[1] Nimrod Megiddo, et al. How to Combine Expert (and Novice) Advice when Actions Impact the Environment?, 2003, NIPS.
[2] Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.
[3] Leslie Pack Kaelbling, et al. Playing is believing: The role of beliefs in multi-agent learning, 2001, NIPS.
[4] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[5] D. Fudenberg, et al. Consistency and Cautious Fictitious Play, 1995.
[6] John Nachbar, et al. Non-computable strategies and discounted repeated games, 1996.
[7] Shie Mannor, et al. Adaptive Strategies and Regret Minimization in Arbitrarily Varying Markov Environments, 2001, COLT/EuroCOLT.
[8] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[9] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[10] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.