Doina Precup | Khimya Khetarpal | Irina Rish | Matthew Riemer
[1] P. Samuelson. A Note on Measurement of Utility , 1937 .
[2] P. Randolph. Bayesian Decision Problems and Markov Chains , 1968 .
[3] Stephen Grossberg,et al. A massively parallel architecture for a self-organizing neural pattern recognition machine , 1988, Comput. Vis. Graph. Image Process..
[4] Harry Heft. Affordances and the Body: An Intentional Analysis of Gibson's Ecological Approach to Visual Perception , 1989 .
[5] Michael McCloskey,et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem , 1989 .
[6] Robert M. French,et al. Using Semi-Distributed Representations to Overcome Catastrophic Forgetting in Connectionist Networks , 1991 .
[7] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SIGART Bull..
[8] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[9] Yoshua Bengio,et al. Learning a synaptic learning rule , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[10] Jürgen Schmidhuber,et al. Learning to Control Fast-Weight Memories: An Alternative to Dynamic Recurrent Networks , 1992, Neural Computation.
[11] Richard S. Sutton,et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta , 1992, AAAI.
[12] Leslie Pack Kaelbling,et al. Learning to Achieve Goals , 1993, IJCAI.
[13] Peter Dayan,et al. Improving Generalization for Temporal Difference Learning: The Successor Representation , 1993, Neural Computation.
[14] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[15] James L. McClelland,et al. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. , 1995, Psychological review.
[16] Anthony V. Robins,et al. Catastrophic Forgetting, Rehearsal and Pseudorehearsal , 1995, Connect. Sci..
[17] Sebastian Thrun,et al. Discovering Structure in Multiple Learning Tasks: The TC Algorithm , 1996, ICML.
[18] Jürgen Schmidhuber. A General Method for Incremental Self-improvement and Multi-agent Learning in Unrestricted Environments , 1996 .
[19] Anthony V. Robins,et al. Consolidation in Neural Networks and in the Sleeping Brain , 1996, Connect. Sci..
[20] Jürgen Schmidhuber,et al. A General Method For Incremental Self-Improvement And Multi-Agent Learning In Unrestricted Environments , 1999 .
[21] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[22] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[23] D. Wolpert,et al. Internal models in the cerebellum , 1998, Trends in Cognitive Sciences.
[24] Jürgen Schmidhuber,et al. Reinforcement Learning with Self-Modifying Policies , 1998, Learning to Learn.
[25] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[26] Zoltán Gábor,et al. Multi-criteria Reinforcement Learning , 1998 .
[27] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[28] N. Whitman. A bitter lesson. , 1999, Academic medicine : journal of the Association of American Medical Colleges.
[29] Marcus Frean,et al. Catastrophic forgetting in simple networks: an analysis of the pseudorehearsal solution. , 1999, Network.
[30] Hiroshi Imamizu,et al. Human cerebellar activity reflecting an acquired internal model of a new tool , 2000, Nature.
[31] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[32] David S. Touretzky,et al. Behavioral considerations suggest an average reward TD model of the dopamine system , 2000, Neurocomputing.
[33] Dit-Yan Yeung,et al. Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making , 2001, Sequence Learning.
[34] Mitsuo Kawato,et al. Multiple Model-Based Reinforcement Learning , 2002, Neural Computation.
[35] Andrew G. Barto,et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes , 2002 .
[36] A. Chemero. An Outline of a Theory of Affordances , 2003, How Shall Affordances be Refined? Four Perspectives.
[37] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[38] Mark B. Ring. CHILD: A First Step Towards Continual Learning , 1997, Machine Learning.
[39] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.
[40] Jürgen Schmidhuber,et al. Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement , 1997, Machine Learning.
[41] Carol Rovane,et al. What is an Agent? , 2004, Synthese.
[42] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[43] Warren B. Powell,et al. Reinforcement Learning and Its Relationship to Supervised Learning , 2004 .
[44] Chrystopher L. Nehaniv,et al. Empowerment: a universal agent-centric measure of control , 2005, 2005 IEEE Congress on Evolutionary Computation.
[45] P. Dayan,et al. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.
[46] Ricardo Vilalta,et al. A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.
[47] P. Dayan,et al. Dopamine, uncertainty and TD learning , 2005, Behavioral and Brain Functions.
[48] Paulo Martins Engel,et al. Dealing with non-stationary environments using context detection , 2006, ICML.
[49] Rich Caruana,et al. Model compression , 2006, KDD '06.
[50] Thomas J. Walsh,et al. Towards a Unified Theory of State Abstraction for MDPs , 2006, AI&M.
[51] Yoshua Bengio,et al. On the Optimization of a Synaptic Learning Rule , 2007 .
[52] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[53] Hiroyuki Mishima. The theory of affordances , 2008 .
[54] Jürgen Schmidhuber,et al. Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes , 2008, ABiALS.
[55] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[56] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[57] Y. Niv. Reinforcement learning in the brain , 2009 .
[58] N. Daw,et al. Human Reinforcement Learning Subdivides Structured Action Spaces by Learning Effector-Specific Values , 2009, The Journal of Neuroscience.
[59] Richard L. Lewis,et al. Where Do Rewards Come From , 2009 .
[60] Pierre-Yves Oudeyer,et al. R-IAC: Robust Intrinsically Motivated Exploration and Active Learning , 2009, IEEE Transactions on Autonomous Mental Development.
[61] Hui Li,et al. Multi-task Reinforcement Learning in Partially Observable Stochastic Environments , 2009, J. Mach. Learn. Res..
[62] P. Dayan,et al. States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning , 2010, Neuron.
[63] David Hsu,et al. Planning under Uncertainty for Robotic Tasks with Mixed Observability , 2010, Int. J. Robotics Res..
[64] Jean-Marc Fellous,et al. Computational models of reinforcement learning: the role of dopamine as a reward signal , 2010, Cognitive Neurodynamics.
[65] U. Rieder,et al. Markov Decision Processes , 2010 .
[66] Estevam R. Hruschka,et al. Toward an Architecture for Never-Ending Language Learning , 2010, AAAI.
[67] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[68] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[69] P. Dayan,et al. Model-based influences on humans’ choices and striatal prediction errors , 2011, Neuron.
[70] Nathaniel D. Daw,et al. Environmental statistics and the trade-off between model-based and TD learning in humans , 2011, NIPS.
[71] Stephanie C. Y. Chan,et al. On the value of information and other rewards , 2011, Nature Neuroscience.
[72] Sean C. Duncan. Minecraft, beyond construction and survival , 2011 .
[73] M. Frank,et al. Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI. , 2012, Cerebral cortex.
[74] N. Daw,et al. The ubiquity of model-based reinforcement learning , 2012, Current Opinion in Neurobiology.
[75] M. Frank,et al. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. , 2012, Cerebral cortex.
[76] Benjamin Rosman,et al. A Multitask Representation Using Reusable Local Policy Templates , 2012, AAAI Spring Symposium: Designing Intelligent Robots.
[77] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[78] Anne G E Collins,et al. How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis , 2012, The European journal of neuroscience.
[79] Natalie M. Trumpp,et al. Embodiment theory and education: The foundations of cognition in perception and action , 2012, Trends in Neuroscience and Education.
[80] N. Daw,et al. Generalization of value in reinforcement learning by humans , 2012, The European journal of neuroscience.
[81] E. Bizzi,et al. A theory for how sensorimotor skills are learned and retained in noisy and nonstationary neural circuits , 2013, Proceedings of the National Academy of Sciences.
[82] Matthew Botvinick,et al. Divide and Conquer: Hierarchical Reinforcement Learning and Task Decomposition in Humans , 2013, Computational and Robotic Models of the Hierarchical Organization of Behavior.
[83] S. Gershman,et al. Moderate levels of activation lead to forgetting in the think/no-think paradigm , 2013, Neuropsychologia.
[84] F. Oliehoek,et al. Scalable Bayesian Reinforcement Learning for Multiagent POMDPs , 2013 .
[85] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..
[86] M. Frank,et al. Acute stress selectively reduces reward sensitivity , 2013, Front. Hum. Neurosci..
[87] Andrew G. Barto,et al. Intrinsic Motivation and Reinforcement Learning , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[88] Ari Weinstein,et al. Model-based hierarchical reinforcement learning and human action control , 2014, Philosophical Transactions of the Royal Society B: Biological Sciences.
[89] Paul Weng,et al. Solving Hidden-Semi-Markov-Mode Markov Decision Problems , 2014, SUM.
[90] Emmanuel Hadoux,et al. Sequential Decision-Making under Non-stationary Environments via Sequential Change-point Detection , 2014 .
[91] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[92] Eric Eaton,et al. Online Multi-Task Learning for Policy Gradient Methods , 2014, ICML.
[93] Alec Solway,et al. Optimal Behavioral Hierarchy , 2014, PLoS Comput. Biol..
[94] Jürgen Leitner,et al. Curiosity driven reinforcement learning for motion planning on humanoids , 2014, Front. Neurorobot..
[95] Naoyuki Kubota,et al. Reinforcement Learning in non-stationary environments: An intrinsically motivated stress based memory retrieval performance (SBMRP) model , 2014, 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE).
[96] Lihong Li,et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning , 2014, ICML.
[97] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[98] Daniele Calandriello,et al. Sparse multi-task reinforcement learning , 2014, Intelligenza Artificiale.
[99] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[100] N. Daw,et al. Model-based learning protects against forming habits , 2015, Cognitive, Affective, & Behavioral Neuroscience.
[101] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[102] Y. Niv,et al. Discovering latent causes in reinforcement learning , 2015, Current Opinion in Behavioral Sciences.
[103] Robert C. Wilson,et al. Reinforcement Learning in Multidimensional Environments Relies on Attention Mechanisms , 2015, The Journal of Neuroscience.
[104] Shakir Mohamed,et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning , 2015, NIPS.
[105] Jianfeng Gao,et al. Recurrent Reinforcement Learning: A Hybrid Approach , 2015, ArXiv.
[106] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[107] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[108] Yusen Zhan,et al. An exploration strategy for non-stationary opponents , 2016, Autonomous Agents and Multi-Agent Systems.
[109] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[110] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[111] Stephen Clark,et al. Virtual Embodiment: A Scalable Long-Term Strategy for Artificial Intelligence Research , 2016, ArXiv.
[112] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[113] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[114] Finale Doshi-Velez,et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations , 2013, IJCAI.
[115] Massimiliano Pontil,et al. The Benefit of Multitask Representation Learning , 2015, J. Mach. Learn. Res..
[116] Wojciech Jaskowski,et al. ViZDoom: A Doom-based AI research platform for visual reinforcement learning , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).
[117] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[118] John Shawe-Taylor,et al. Learning Shared Representations in Multi-task Reinforcement Learning , 2016, ArXiv.
[119] Shie Mannor,et al. Adaptive Skills Adaptive Partitions (ASAP) , 2016, NIPS.
[120] G. Pezzulo,et al. Navigating the Affordance Landscape: Feedback Control as a Process Model of Behavior and Cognition , 2016, Trends in Cognitive Sciences.
[121] Pieter Abbeel,et al. Meta-Learning with Temporal Convolutions , 2017, ArXiv.
[122] Marc'Aurelio Ranzato,et al. Gradient Episodic Memory for Continual Learning , 2017, NIPS.
[123] Siobhán Clarke,et al. Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments , 2017, ACM Trans. Auton. Adapt. Syst..
[124] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[125] S. Gershman,et al. Dopamine reward prediction errors reflect hidden state inference across time , 2017, Nature Neuroscience.
[126] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[127] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[128] Honglak Lee,et al. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning , 2017, ICML.
[129] Chrisantha Fernando,et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.
[130] Saurabh Kumar,et al. Learning to Compose Skills , 2017, ArXiv.
[131] J. Tenenbaum,et al. Ingredients of intelligence: From classic debates to an engineering roadmap , 2017, Behavioral and Brain Sciences.
[132] Dan Klein,et al. Modular Multitask Reinforcement Learning with Policy Sketches , 2016, ICML.
[133] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[134] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[135] M. Littman,et al. Toward Good Abstractions for Lifelong Learning , 2017 .
[136] D. Hassabis,et al. Neuroscience-Inspired Artificial Intelligence , 2017, Neuron.
[137] Razvan Pascanu,et al. Overcoming catastrophic forgetting in neural networks , 2016, Proceedings of the National Academy of Sciences.
[138] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[139] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[140] Sergey Levine,et al. Learning modular neural network policies for multi-task and multi-robot transfer , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[141] Nathaniel D. Daw,et al. Self-Evaluation of Decision-Making: A General Bayesian Framework for Metacognitive Computation , 2017, Psychological review.
[142] Li Zhang,et al. Learning to Learn: Meta-Critic Networks for Sample Efficient Learning , 2017, ArXiv.
[143] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[144] M. Riemer,et al. Representation Stability as a Regularizer for Improved Text Analytics Transfer Learning , 2017, ArXiv.
[145] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[146] S. Gershman. Reinforcement Learning and Causal Models , 2017 .
[147] Trevor Darrell,et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning , 2016, ICLR.
[148] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[149] Gregory Dudek,et al. Benchmark Environments for Multitask Learning in Continuous Domains , 2017, ArXiv.
[150] S. Shankar Sastry,et al. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning , 2017, ArXiv.
[151] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[152] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.
[153] M. Botvinick,et al. The successor representation in human reinforcement learning , 2016, Nature Human Behaviour.
[154] Michael R. Waldmann,et al. The Oxford handbook of causal reasoning , 2017 .
[155] Sebastian Risi,et al. Automated Curriculum Learning by Rewarding Temporally Rare Events , 2018, 2018 IEEE Conference on Computational Intelligence and Games (CIG).
[156] Richard Socher,et al. Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning , 2017, ICLR.
[157] Leslie Pack Kaelbling,et al. Modular meta-learning , 2018, CoRL.
[158] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[159] Y. Niv,et al. Model-based predictions for dopamine , 2018, Current Opinion in Neurobiology.
[160] David Silver,et al. Meta-Gradient Reinforcement Learning , 2018, NeurIPS.
[161] Balaraman Ravindran,et al. Learning to Multi-Task by Active Sampling , 2017, ICLR.
[162] Michael L. Littman,et al. State Abstractions for Lifelong Reinforcement Learning , 2018, ICML.
[163] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[164] Satinder Singh,et al. On Learning Intrinsic Rewards for Policy Gradient Methods , 2018, NeurIPS.
[165] Satinder Singh,et al. Many-Goals Reinforcement Learning , 2018, ArXiv.
[166] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..
[167] Samy Bengio,et al. A Study on Overfitting in Deep Reinforcement Learning , 2018, ArXiv.
[168] Martha White,et al. The Barbados 2018 List of Open Issues in Continual Learning , 2018, ArXiv.
[169] Marcelo G Mattar,et al. Prioritized memory access explains planning and hippocampal replay , 2017, Nature Neuroscience.
[170] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[171] Murray Shanahan,et al. Continual Reinforcement Learning with Complex Synapses , 2018, ICML.
[172] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[173] Gerald Tesauro,et al. Learning Abstract Options , 2018, NeurIPS.
[174] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[175] Joel Z. Leibo,et al. Prefrontal cortex as a meta-reinforcement learning system , 2018, bioRxiv.
[176] Marlos C. Machado,et al. Generalization and Regularization in DQN , 2018, ArXiv.
[177] Ida Momennejad,et al. Offline replay supports planning in human reinforcement learning , 2018, eLife.
[178] Marlos C. Machado,et al. Eigenoption Discovery through the Deep Successor Representation , 2017, ICLR.
[179] Pieter Abbeel,et al. Variational Option Discovery Algorithms , 2018, ArXiv.
[180] Julian Togelius,et al. Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation , 2018, ArXiv.
[181] Marcus Hutter,et al. On Q-learning Convergence for Non-Markov Decision Processes , 2018, IJCAI.
[182] J. Schulman,et al. Reptile: a Scalable Metalearning Algorithm , 2018 .
[183] Joelle Pineau,et al. RE-EVALUATE: Reproducibility in Evaluating Reinforcement Learning Algorithms , 2018 .
[184] Zhanxing Zhu,et al. Reinforced Continual Learning , 2018, NeurIPS.
[185] Song-Chun Zhu,et al. Interactive Agent Modeling by Learning to Probe , 2018, ArXiv.
[186] Glen Berseth,et al. Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control , 2018, ICLR.
[187] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[188] Qiang Liu,et al. Learning to Explore with Meta-Policy Gradient , 2018, ICML.
[189] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[190] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[191] Yoshua Bengio,et al. Bayesian Model-Agnostic Meta-Learning , 2018, NeurIPS.
[192] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[193] David Isele,et al. Selective Experience Replay for Lifelong Learning , 2018, AAAI.
[194] Doina Precup,et al. Environments for Lifelong Reinforcement Learning , 2018, ArXiv.
[195] Derek Hoiem,et al. Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[196] Doina Precup,et al. When Waiting is not an Option : Learning Options with a Deliberation Cost , 2017, AAAI.
[197] Yee Whye Teh,et al. Progress & Compress: A scalable framework for continual learning , 2018, ICML.
[198] John Schulman,et al. Gotta Learn Fast: A New Benchmark for Generalization in RL , 2018, ArXiv.
[199] Pieter Abbeel,et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning , 2018, ICLR 2018.
[200] Martha White,et al. Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains , 2018, IJCAI.
[201] Elliot Meyerson,et al. Evolutionary architecture search for deep multitask networks , 2018, GECCO.
[202] Shimon Whiteson,et al. DiCE: The Infinitely Differentiable Monte-Carlo Estimator , 2018, ICML.
[203] Elliot Meyerson,et al. Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering , 2017, ICLR.
[204] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[205] Dawn Xiaodong Song,et al. Assessing Generalization in Deep Reinforcement Learning , 2018, ArXiv.
[206] Tom Schaul,et al. Unicorn: Continual Learning with a Universal, Off-policy Agent , 2018, ArXiv.
[207] Theodore Lim,et al. SMASH: One-Shot Model Architecture Search through HyperNetworks , 2017, ICLR.
[208] Jonathan D. Cohen,et al. Efficiency of learning vs. processing: Towards a normative theory of multitasking , 2020, CogSci.
[209] J. Schmidhuber. Making the world differentiable: on using self supervised fully recurrent neural networks for dynamic reinforcement learning and planning in non-stationary environments , 1990, Forschungsberichte, TU Munich.
[210] Joelle Pineau,et al. Decoupling Dynamics and Reward for Transfer Learning , 2018, ICLR.
[211] S. Gershman,et al. Belief state representation in the dopamine system , 2018, Nature Communications.
[212] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[213] Matthew Riemer,et al. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning , 2017, ICLR.
[214] Djallel Bouneffouf,et al. Scalable Recollections for Continual Lifelong Learning , 2017, AAAI.
[215] David Filliat,et al. DisCoRL: Continual Reinforcement Learning via Policy Distillation , 2019, ArXiv.
[216] S. Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[217] Neil D. Lawrence,et al. Transferring Knowledge across Learning Processes , 2018, ICLR.
[218] Yan Wu,et al. Optimizing agent behavior over long time scales by transporting value , 2018, Nature Communications.
[219] Kate Saenko,et al. Learning Multi-Level Hierarchies with Hindsight , 2017, ICLR.
[220] Filipe Wall Mutz,et al. Hindsight policy gradients , 2017, ICLR.
[221] Murray Shanahan,et al. Policy Consolidation for Continual Reinforcement Learning , 2019, ICML.
[222] Y. Niv. Learning task-state representations , 2019, Nature Neuroscience.
[223] Dong Yan,et al. Reward Shaping via Meta-Learning , 2019, ArXiv.
[224] Ramakanth Pasunuru,et al. Continual and Multi-Task Architecture Search , 2019, ACL.
[225] Katja Hofmann,et al. Fast Context Adaptation via Meta-Learning , 2018, ICML.
[226] Sergey Levine,et al. Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL , 2018, ICLR.
[227] Pierre-Yves Oudeyer,et al. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning , 2018, ICML.
[228] Sergey Levine,et al. Online Meta-Learning , 2019, ICML.
[229] Christopher Potts,et al. Recursive Routing Networks: Learning to Compose Modules for Language Understanding , 2019, NAACL.
[230] Joelle Pineau,et al. Online Learned Continual Compression with Stacked Quantization Module , 2019, ArXiv.
[231] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[232] Runhao Zeng,et al. Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction , 2019, ArXiv.
[233] Martha White,et al. Meta-Learning Representations for Continual Learning , 2019, NeurIPS.
[234] Bruno A. Olshausen,et al. Superposition of many models into one , 2019, NeurIPS.
[235] Marc'Aurelio Ranzato,et al. Efficient Lifelong Learning with A-GEM , 2018, ICLR.
[236] Philip S. Thomas,et al. Learning Action Representations for Reinforcement Learning , 2019, ICML.
[237] G. Spigler. Meta-learnt priors slow down catastrophic forgetting in neural networks , 2019, ArXiv.
[238] Doina Precup,et al. The Option Keyboard: Combining Skills in Reinforcement Learning , 2021, NeurIPS.
[239] Danesh Shahnazian,et al. Subgoal- and Goal-related Reward Prediction Errors in Medial Prefrontal Cortex , 2019, Journal of Cognitive Neuroscience.
[240] Thomas L. Griffiths,et al. Automatically Composing Representation Transformations as a Means for Generalization , 2018, ICLR.
[241] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[242] Michael R. Meager,et al. Hippocampal Contributions to Model-Based Planning and Spatial Memory , 2019, Neuron.
[243] Stefan Wermter,et al. Continual Lifelong Learning with Neural Networks: A Review , 2019, Neural Networks.
[244] Nan Jiang,et al. On Value Functions and the Agent-Environment Boundary , 2019, ArXiv.
[245] Richard L. Lewis,et al. Discovery of Useful Questions as Auxiliary Tasks , 2019, NeurIPS.
[246] Erwan Lecarpentier,et al. Non-Stationary Markov Decision Processes a Worst-Case Approach using Model-Based Reinforcement Learning , 2019, NeurIPS.
[247] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[248] S. Levine,et al. Guided Meta-Policy Search , 2019, NeurIPS.
[249] Wojciech Jaskowski,et al. Model-Based Active Exploration , 2018, ICML.
[250] Patrick van der Smagt,et al. Unsupervised Real-Time Control Through Variational Empowerment , 2017, ISRR.
[251] Yee Whye Teh,et al. Exploiting Hierarchy for Learning and Transfer in KL-regularized RL , 2019, ArXiv.
[252] Quoc V. Le,et al. Diversity and Depth in Per-Example Routing Models , 2018, ICLR.
[253] Matthias De Lange,et al. Continual learning: A comparative study on how to defy forgetting in classification tasks , 2019, ArXiv.
[254] Yoshua Bengio,et al. Automated curriculum generation for Policy Gradients from Demonstrations , 2019, ArXiv.
[255] Lei Cao,et al. Learning to Learn: Hierarchical Meta-Critic Networks , 2019, IEEE Access.
[256] Yee Whye Teh,et al. Meta reinforcement learning as task inference , 2019, ArXiv.
[257] Atil Iscen,et al. NoRML: No-Reward Meta Learning , 2019, AAMAS.
[258] Yee Whye Teh,et al. Meta-learning of Sequential Strategies , 2019, ArXiv.
[259] Richard Socher,et al. Competitive Experience Replay , 2019, ICLR.
[260] Nicolas Le Roux,et al. A Geometric Perspective on Optimal Representations for Reinforcement Learning , 2019, NeurIPS.
[261] Sébastien Racanière,et al. Automated curricula through setter-solver interactions , 2019, ArXiv.
[262] Ignacio Cases,et al. Routing Networks and the Challenges of Modular and Compositional Computation , 2019, ArXiv.
[263] Nan Jiang,et al. Provably efficient RL with Rich Observations via Latent State Decoding , 2019, ICML.
[264] Joelle Pineau,et al. Learning Causal State Representations of Partially Observable Environments , 2019, ArXiv.
[265] Tamim Asfour,et al. ProMP: Proximal Meta-Policy Search , 2018, ICLR.
[266] Rui Wang,et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions , 2019, ArXiv.
[267] Kenneth O. Stanley,et al. Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.
[268] Gerald Tesauro,et al. Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference , 2018, ICLR.
[269] Alexei A. Efros,et al. Large-Scale Study of Curiosity-Driven Learning , 2018, ICLR.
[270] Siyuan Li,et al. Context-Aware Policy Reuse , 2018, AAMAS.
[271] Joelle Pineau,et al. Combined Reinforcement Learning via Abstract Representations , 2018, AAAI.
[272] David Rolnick,et al. Experience Replay for Continual Learning , 2018, NeurIPS.
[273] Nicolas W. Schuck,et al. Sequential replay of nonspatial task states in the human hippocampus , 2018, Science.
[274] Sergey Levine,et al. Search on the Replay Buffer: Bridging Planning and Reinforcement Learning , 2019, NeurIPS.
[275] Falk Lieder,et al. Doing more with less: meta-reasoning and meta-learning in humans and machines , 2019, Current Opinion in Behavioral Sciences.
[276] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[277] Joel Z. Leibo,et al. Options as responses: Grounding behavioural hierarchies in multi-agent RL , 2019, ArXiv.
[278] Anthony I. Jang,et al. Positive reward prediction errors during decision making strengthen memory encoding , 2019, Nature Human Behaviour.
[279] Jordi Torres,et al. Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills , 2020, ICML.
[280] Yi Wu,et al. Multi-Task Reinforcement Learning with Soft Modularization , 2020, NeurIPS.
[281] Ruosong Wang,et al. Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning? , 2020, ICLR.
[282] Ali Farhadi,et al. Supermasks in Superposition , 2020, NeurIPS.
[283] David Vázquez,et al. Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning , 2020, NeurIPS.
[284] Adam S. Lowet,et al. Distributional Reinforcement Learning in the Brain , 2020, Trends in Neurosciences.
[285] Rob Fergus,et al. Fast Adaptation via Policy-Dynamics Value Functions , 2020, ArXiv.
[286] Jared Kaplan,et al. A Neural Scaling Law from the Dimension of the Data Manifold , 2020, ArXiv.
[287] Andrei A. Rusu,et al. Embracing Change: Continual Learning in Deep Neural Networks , 2020, Trends in Cognitive Sciences.
[288] Tor Lattimore,et al. Behaviour Suite for Reinforcement Learning , 2019, ICLR.
[289] Luisa M. Zintgraf,et al. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning , 2019, ICLR.
[290] Pieter Abbeel,et al. Planning to Explore via Self-Supervised World Models , 2020, ICML.
[291] Junhyuk Oh,et al. Discovering Reinforcement Learning Algorithms , 2020, NeurIPS.
[292] Mark Chen,et al. Scaling Laws for Autoregressive Generative Modeling , 2020, ArXiv.
[293] Alex Smola,et al. Meta-Q-Learning , 2019, ICLR.
[294] S. Levine,et al. Gradient Surgery for Multi-Task Learning , 2020, NeurIPS.
[295] Sridhar Mahadevan,et al. Optimizing for the Future in Non-Stationary MDPs , 2020, ICML.
[296] Joel Lehman,et al. Learning to Continually Learn , 2020, ECAI.
[297] Doina Precup,et al. Invariant Causal Prediction for Block MDPs , 2020, ICML.
[298] David Simchi-Levi,et al. Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism , 2020, ICML.
[299] S. Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[300] Pierre-Yves Oudeyer,et al. Automatic Curriculum Learning For Deep RL: A Short Survey , 2020, IJCAI.
[301] Doina Precup,et al. Options of Interest: Temporal Abstraction with Interest Functions , 2020, AAAI.
[302] Timothy M. Hospedales,et al. Online Meta-Critic Learning for Off-Policy Actor-Critic Methods , 2020, NeurIPS.
[303] Guangwen Yang,et al. Model-based Adversarial Meta-Reinforcement Learning , 2020, NeurIPS.
[304] Tom Mitchell,et al. Jelly Bean World: A Testbed for Never-Ending Learning , 2020, ICLR.
[305] Joelle Pineau,et al. Interference and Generalization in Temporal Difference Learning , 2020, ICML.
[306] Felipe Petroski Such,et al. Generalized Hidden Parameter MDPs Transferable Model-based RL in a Handful of Trials , 2020, AAAI.
[307] Richard J. Duro,et al. DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics , 2020, ArXiv.
[308] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[309] G. Tesauro,et al. On the Role of Weight Sharing During Deep Option Learning , 2019, AAAI.
[310] Junhyuk Oh,et al. Meta-Gradient Reinforcement Learning with an Objective Discovered Online , 2020, NeurIPS.
[311] Pascal Vincent,et al. Efficient Learning in Non-Stationary Linear Markov Decision Processes , 2020, ArXiv.
[312] Doina Precup,et al. Value Preserving State-Action Abstractions , 2020, AISTATS.
[313] Pieter Abbeel,et al. Generalized Hindsight for Reinforcement Learning , 2020, NeurIPS.
[314] Ida Momennejad. Learning Structures: Predictive Representations, Replay, and Generalization , 2020, Current Opinion in Behavioral Sciences.
[315] Doina Precup,et al. What can I do here? A Theory of Affordances in Reinforcement Learning , 2020, ICML.
[316] J. Schulman,et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning , 2019, ICML.
[317] Joel Lehman,et al. Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions , 2020, ICML.
[318] Scott M. Jordan,et al. Towards Safe Policy Improvement for Non-Stationary MDPs , 2020, NeurIPS.
[319] Junhyuk Oh,et al. What Can Learned Intrinsic Rewards Capture? , 2019, ICML.
[320] Min Lin,et al. Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , 2020, ArXiv.
[321] Chelsea Finn,et al. Deep Reinforcement Learning amidst Lifelong Non-Stationarity , 2020, ArXiv.
[322] Daniel Guo,et al. Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning , 2020, ICML.
[323] Jacob Andreas,et al. Experience Grounds Language , 2020, EMNLP.
[324] Tom Schaul,et al. Policy Evaluation Networks , 2020, ArXiv.
[325] Shimon Whiteson,et al. Multitask Soft Option Learning , 2019, UAI.
[326] Alec Radford,et al. Scaling Laws for Neural Language Models , 2020, ArXiv.
[327] Andrea Bonarini,et al. Sharing Knowledge in Multi-Task Deep Reinforcement Learning , 2020, ICLR.
[328] Junhyuk Oh,et al. Self-Tuning Deep Reinforcement Learning , 2020, ArXiv.
[329] Timothy M. Hospedales,et al. Meta-Learning in Neural Networks: A Survey , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[330] Yoshua Bengio,et al. CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning , 2020, ICLR.
[331] P. Abbeel,et al. Reset-Free Lifelong Learning with Skill-Space Planning , 2020, ICLR.
[332] Samuel D McDougle,et al. The role of executive function in shaping reinforcement learning , 2021, Current Opinion in Behavioral Sciences.
[333] Tinne Tuytelaars,et al. A Continual Learning Survey: Defying Forgetting in Classification Tasks , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[334] Brendan McCane,et al. Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting , 2018, Neurocomputing.
[335] Gerald Tesauro,et al. A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning , 2020, ICML.
[336] E. Kaufmann,et al. A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces , 2020, AISTATS.
[337] Sara Hooker,et al. The hardware lottery , 2020, Commun. ACM.