Safely Interruptible Agents
[1] Michael I. Jordan, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.
[2] Satinder Singh, et al. An Upper Bound on the Loss from Approximate Optimal-Value Functions, 2004, Machine-mediated learning.
[3] Mark Humphrys. Action Selection in a hypothetical house robot: Using those RL numbers, 1996.
[4] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[5] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[6] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[7] Tommi S. Jaakkola, et al. Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, 2000, Machine Learning.
[8] Marcus Hutter. Simulation Algorithms for Computational Systems Biology, 2017, Texts in Theoretical Computer Science. An EATCS Series.
[9] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[10] Stephen M. Omohundro, et al. The Basic AI Drives, 2008, AGI.
[11] Jürgen Schmidhuber, et al. Optimal Direct Policy Search, 2011, AGI.
[12] Tor Lattimore, et al. Asymptotically Optimal Agents, 2011, ALT.
[13] Jürgen Schmidhuber, et al. Artificial General Intelligence - 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3-6, 2011. Proceedings, 2011, AGI.
[14] Laurent Orseau, et al. Delusion, Survival, and Intelligent Agents, 2011, AGI.
[15] Dr. Tom Murphy. The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel, 2013.
[16] Laurent Orseau, et al. Asymptotic non-learnability of universal agents with computable horizon functions, 2013, Theor. Comput. Sci.
[17] Tor Lattimore, et al. Bayesian Reinforcement Learning with Exploration, 2014, ALT.
[18] Tomás Svoboda, et al. Safe Exploration Techniques for Reinforcement Learning - An Overview, 2014, MESAS.
[19] Nick Bostrom, et al. Superintelligence: Paths, Dangers, Strategies, 2014.
[20] Jan Hodicky, et al. Modelling and Simulation for Autonomous Systems, 2014, Lecture Notes in Computer Science.
[21] Marcus Hutter, et al. Bad Universal Priors and Notions of Optimality, 2015, COLT.
[22] James Babcock, et al. Artificial General Intelligence, 2016, Lecture Notes in Computer Science.