Paper Information - Targeted Opponent Modeling of Memory-Bounded Agents - 字舞流文

Targeted Opponent Modeling of Memory-Bounded Agents

Doran Chakraborty and Peter Stone | D. A. Stone

[1] Noa Agmon, et al. Leading ad hoc agents in joint action settings with multiple teammates, 2012, AAMAS.

[2] Milind Tambe, et al. Towards Optimal Patrol Strategies for Fare Inspection in Transit Systems, 2012, AAAI Spring Symposium: Game Theory for Security, Sustainability, and Health.

[3] Sarit Kraus, et al. Multi-Robot Adversarial Patrolling: Facing a Full-Knowledge Opponent, 2011, J. Artif. Intell. Res.

[4] Peter Stone, et al. Convergence, Targeted Optimality, and Safety in Multiagent Learning, 2010, ICML.

[5] Lynne E. Parker, et al. A fault-tolerant modular control approach to multi-robot perimeter patrol, 2009, IEEE International Conference on Robotics and Biomimetics (ROBIO).

[6] Yoav Shoham, et al. Learning against opponents with bounded memory, 2005, IJCAI.

[7] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.

[8] Andrew G. Barto, et al. Reinforcement learning, 1998.

[9] David H. Wolpert, et al. No free lunch theorems for optimization, 1997, IEEE Trans. Evol. Comput.

[10] W. Hoeffding. Probability inequalities for sums of bounded random variables, 1963.