On The Effect of Auxiliary Tasks on Representation Dynamics
Clare Lyle | Georg Ostrovski | Mark Rowland | Will Dabney