[1] E. Altman. Constrained Markov Decision Processes, 1999.
[2] Sarit Kraus, et al. Towards a formalization of teamwork with resource constraints, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004).
[3] Fritz Wysotzki, et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, 2005, J. Artif. Intell. Res.
[4] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[5] Sylvie Thiébaux, et al. RAO*: An Algorithm for Chance-Constrained POMDP's, 2016, AAAI.
[6] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[7] Jonathan P. How, et al. An online algorithm for constrained POMDPs, 2010, IEEE International Conference on Robotics and Automation.
[8] S. Marcus, et al. Approximate receding horizon approach for Markov decision processes: average reward case, 2003.
[9] Masahiro Ono, et al. Joint chance-constrained dynamic programming, 2012, 51st IEEE Conference on Decision and Control (CDC).
[10] Johannes Bisschop, et al. AIMMS - Optimization Modeling, 2006.
[11] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[12] Daniel P. Heyman, et al. Stochastic models in operations research, 1982.
[13] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[14] L. Rossman. Reliability‐constrained dynamic programing and randomized release rules in reservoir management, 1977.
[15] Raymond L. Smith, et al. Rolling Horizon Procedures in Nonhomogeneous Markov Decision Processes, 1992, Oper. Res.
[16] Masahiro Ono, et al. Paper Summary: Probabilistic Planning for Continuous Dynamic Systems under Bounded Risk, 2013, ICAPS.
[17] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[18] J. Bather, et al. Multi‐Armed Bandit Allocation Indices, 1990.
[19] Jason D. Williams. Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions, 2011.
[20] Edmund H. Durfee, et al. Stationary Deterministic Policies for Constrained MDPs with Multiple Rewards, Costs, and Discount Factors, 2005, IJCAI.
[21] F. B. Hildebrand, et al. Introduction To Numerical Analysis, 1957.