Shane Legg | Jordi Grau-Moya | Pedro A. Ortega | Tim Genewein | Miljan Martic | Vladimir Mikulik | Markus Kunesch | Grégoire Delétang | Tom McGrath
[1] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[2] John Schulman,et al. Concrete Problems in AI Safety , 2016, ArXiv.
[3] Shane Legg,et al. Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings , 2019, ArXiv.
[4] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[5] Jeremy Nixon,et al. Resolving Spurious Correlations in Causal Models of Environments via Interventions , 2020, ArXiv.
[6] D. Braddon-Mitchell. Nature's Capacities and Their Measurement , 1991 .
[7] Mélanie Frappier,et al. The Book of Why: The New Science of Cause and Effect , 2018, Science.
[8] Abhinav Verma,et al. Programmatically Interpretable Reinforcement Learning , 2018, ICML.
[9] Eric M. S. P. Veith,et al. Explainable Reinforcement Learning: A Survey , 2020, CD-MAKE.
[10] Alex Mott,et al. Towards Interpretable Reinforcement Learning Using Attention Augmented Agents , 2019, NeurIPS.
[11] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[12] A. Dawid. Causal Inference without Counterfactuals , 2000 .
[13] Daniel Polani,et al. Information Theory of Decisions and Actions , 2011 .
[14] Wojciech Samek,et al. Methods for interpreting and understanding deep neural networks , 2017, Digit. Signal Process..
[15] Marcin Andrychowicz,et al. Learning to learn by gradient descent by gradient descent , 2016, NIPS.
[16] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[17] Pushmeet Kohli,et al. Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures , 2018, ICLR.
[18] Viktor Mikhaĭlovich Glushkov,et al. An Introduction to Cybernetics , 1957, The Mathematical Gazette.
[19] Shane Legg,et al. The Incentives that Shape Behaviour , 2020, ArXiv.
[20] Tom Burr,et al. Causation, Prediction, and Search , 2003, Technometrics.
[21] A. Dawid,et al. Statistical Causality from a Decision-Theoretic Perspective , 2014, 1405.2292.
[22] Joseph Y. Halpern,et al. Actual causation and the art of modeling , 2011, ArXiv.
[23] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[24] Elias Bareinboim,et al. Bandits with Unobserved Confounders: A Causal Approach , 2015, NIPS.
[25] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[26] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[27] Marcus Hutter,et al. A Philosophical Treatise of Universal Induction , 2011, Entropy.
[28] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[29] Ilya Shpitser,et al. Counterfactual Graphical Models for Longitudinal Mediation Analysis With Unobserved Confounding , 2012, Cogn. Sci..
[30] Silvia Chiappa,et al. Path-Specific Counterfactual Fairness , 2018, AAAI.
[31] J. I. The Design of Experiments , 1936, Nature.
[32] Wilhelm Cauer,et al. Theorie der linearen Wechselstromschaltungen , 1940 .
[33] F. H. Adler. Cybernetics, or Control and Communication in the Animal and the Machine. , 1949 .
[34] Stuart J. Russell,et al. Research Priorities for Robust and Beneficial Artificial Intelligence , 2015, AI Mag..
[35] David Lopez-Paz,et al. Invariant Risk Minimization , 2019, ArXiv.
[36] J. Pearl,et al. Causal Inference in Statistics: A Primer , 2016 .
[37] Bram Bakker,et al. Reinforcement Learning with Long Short-Term Memory , 2001, NIPS.
[38] J. Tenenbaum,et al. Structure and strength in causal induction , 2005, Cognitive Psychology.
[39] D. Olton. Mazes, maps, and memory. , 1979, The American psychologist.