[1] S. Marcus,et al. Risk sensitive control of Markov processes in countable state space , 1996 .
[2] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[3] Daniel Hernández-Hernández,et al. Risk Sensitive Markov Decision Processes , 1997 .
[4] D. Bertsekas. Gradient convergence in gradient methods , 1997 .
[5] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[6] E. Altman. Constrained Markov Decision Processes , 1999 .
[7] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[8] John N. Tsitsiklis,et al. A survey of computational complexity results in systems and control , 2000, Autom..
[9] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[10] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[11] John Langford,et al. Exploration in Metric State Spaces , 2003, ICML.
[12] A. McEwen,et al. Mars Reconnaissance Orbiter's High Resolution Imaging Science Experiment (HiRISE) , 2007 .
[13] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 1998, Machine Learning.
[14] Michael L. Littman,et al. A theoretical analysis of Model-Based Interval Estimation , 2005, ICML.
[15] Laurent El Ghaoui,et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices , 2005, Oper. Res..
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] MSL Landing Site Selection User's Guide to Engineering Constraints , 2006 .
[18] M. Lockwood. Introduction: Mars Science Laboratory: The Next Generation of Mars Landers , 2006 .
[19] Shie Mannor,et al. Percentile optimization in uncertain Markov decision processes with application to efficient exploration , 2007, ICML '07.
[20] Thomas J. Walsh,et al. Knows what it knows: a framework for self-aware learning , 2008, ICML '08.
[21] Steffen Udluft,et al. Safe exploration for reinforcement learning , 2008, ESANN.
[22] Andrew Y. Ng,et al. Near-Bayesian exploration in polynomial time , 2009, ICML '09.
[23] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res..
[24] Alborz Geramifard,et al. UAV cooperative control with stochastic risk models , 2011, Proceedings of the 2011 American Control Conference.
[25] John N. Tsitsiklis,et al. Mean-Variance Optimization in Markov Decision Processes , 2011, ICML.
[26] Claire J. Tomlin,et al. Guaranteed safe online learning of a bounded system , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Claire J. Tomlin,et al. Extensions of learning-based model predictive control for real-time application to a quadrotor helicopter , 2012, 2012 American Control Conference (ACC).
[28] Klaus Obermayer,et al. Risk-Sensitive Reinforcement Learning , 2013, Neural Computation.