Shane Legg | Tom Everitt | Jan Leike | David Krueger | Miljan Martic | Vishal Maini