Curious Replay for Model-based Adaptation