Targeted Opponent Modeling of Memory-Bounded Agents
[1] Noa Agmon, et al. Leading ad hoc agents in joint action settings with multiple teammates, 2012, AAMAS.
[2] Milind Tambe, et al. Towards Optimal Patrol Strategies for Fare Inspection in Transit Systems, 2012, AAAI Spring Symposium: Game Theory for Security, Sustainability, and Health.
[3] Sarit Kraus, et al. Multi-Robot Adversarial Patrolling: Facing a Full-Knowledge Opponent, 2011, J. Artif. Intell. Res.
[4] Peter Stone, et al. Convergence, Targeted Optimality, and Safety in Multiagent Learning, 2010, ICML.
[5] Lynne E. Parker, et al. A fault-tolerant modular control approach to multi-robot perimeter patrol, 2009, 2009 IEEE International Conference on Robotics and Biomimetics (ROBIO).
[6] Yoav Shoham, et al. Learning against opponents with bounded memory, 2005, IJCAI.
[7] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[8] Andrew G. Barto, et al. Reinforcement learning, 1998.
[9] David H. Wolpert, et al. No free lunch theorems for optimization, 1997, IEEE Trans. Evol. Comput.
[10] W. Hoeffding. Probability inequalities for sums of bounded random variables, 1963.