Tom Schaul | Simon Osindero | Gregory Farquhar | Feryal Behbahani | Angelos Filos | Diana Borsa | André Barreto | Eszter Vértes | Zita Marinho | Abram Friesen