Counterfactual Data Augmentation using Locally Factored Dynamics