Shaobo Hou | Brendan O'Donoghue | André Barreto | Tom Zahavy | Iurii Kemaev | Daniel J. Mankowitz | Satinder Singh
[1] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[2] Patrice Marcotte, et al. Some comments on Wolfe's ‘away step’, 1986, Math. Program.
[3] Haim Kaplan, et al. Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies, 2019, ALT.
[4] Shie Mannor, et al. Robustness and generalization, 2010, Machine Learning.
[5] Dana H. Ballard, et al. Multiple-Goal Reinforcement Learning with Modular Sarsa(0), 2003, IJCAI.
[6] Doina Precup, et al. Fast reinforcement learning with generalized policy updates, 2020, Proceedings of the National Academy of Sciences.
[7] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[8] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[9] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[10] Elad Hazan, et al. A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization, 2013, arXiv:1301.4666.
[11] David Warde-Farley, et al. Fast Task Inference with Variational Intrinsic Successor Features, 2019, ICLR.
[12] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[13] Philip Wolfe, et al. An algorithm for quadratic programming, 1956.
[14] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[15] Shie Mannor, et al. Robust Regression and Lasso, 2008, IEEE Transactions on Information Theory.
[16] Stuart J. Russell, et al. Q-Decomposition for Reinforcement Learning Agents, 2003, ICML.
[17] Haim Kaplan, et al. Average reward reinforcement learning with unknown mixing times, 2019, arXiv.
[18] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[19] Tom Schaul, et al. Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement, 2018, ICML.
[20] Stephen P. Boyd, et al. CVXPY: A Python-Embedded Modeling Language for Convex Optimization, 2016, J. Mach. Learn. Res.
[21] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[22] Raia Hadsell, et al. Disentangled Cumulants Help Successor Representations Transfer to New Tasks, 2019, arXiv.
[23] Laurent El Ghaoui, et al. Robust Solutions to Least-Squares Problems with Uncertain Data, 1997, SIAM J. Matrix Anal. Appl.
[24] Haim Kaplan, et al. Apprenticeship Learning via Frank-Wolfe, 2019, AAAI.
[25] Junhyuk Oh, et al. A Self-Tuning Actor-Critic Algorithm, 2020, NeurIPS.
[26] Martin Jaggi, et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization, 2013, ICML.
[27] Michael I. Jordan, et al. Reinforcement Learning with Soft State Aggregation, 1994, NIPS.