Apprenticeship Learning via Frank-Wolfe
Haim Kaplan | Yishay Mansour | Alon Cohen | Tom Zahavy
[1] Yinyu Ye,et al. The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate , 2011, Math. Oper. Res..
[2] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[3] Michael H. Bowling,et al. Apprenticeship learning using linear programming , 2008, ICML '08.
[4] Elad Hazan,et al. Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets , 2014, ICML.
[5] Haipeng Luo,et al. Variance-Reduced and Projection-Free Stochastic Optimization , 2016, ICML.
[6] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[7] Marc Teboulle,et al. A conditional gradient method with linear rate of convergence for solving convex linear systems , 2004, Math. Methods Oper. Res..
[8] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[9] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[10] Dan Garber,et al. A Linearly Convergent Variant of the Conditional Gradient Algorithm under Strong Convexity, with Applications to Online and Stochastic Optimization , 2016, SIAM J. Optim..
[11] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[12] Martin Jaggi,et al. An Affine Invariant Linear Convergence Analysis for Frank-Wolfe Algorithms , 2013, arXiv:1312.7864.
[13] Elad Hazan,et al. A Linearly Convergent Conditional Gradient Algorithm with Applications to Online and Stochastic Optimization , 2013, arXiv:1301.4666.
[14] Haim Kaplan,et al. Average reward reinforcement learning with unknown mixing times , 2019, ArXiv.
[15] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[16] Shimrit Shtern,et al. Linearly convergent away-step conditional gradient for non-strongly convex functions , 2015, Math. Program..
[17] Boris Polyak,et al. Constrained minimization methods , 1966 .
[18] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[19] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[20] Peter Bro Miltersen,et al. Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor , 2010, JACM.
[21] Elad Hazan,et al. Projection-free Online Learning , 2012, ICML.
[22] A. S. Nemirovsky,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[23] Javier Peña,et al. Polytope Conditioning and Linear Convergence of the Frank-Wolfe Algorithm , 2015, Math. Oper. Res..
[24] Philip Wolfe,et al. An algorithm for quadratic programming , 1956 .
[25] Haim Kaplan,et al. Unknown mixing times in apprenticeship and reinforcement learning , 2020, UAI.
[26] Patrice Marcotte,et al. Some comments on Wolfe's ‘away step’ , 1986, Math. Program..
[27] Martin Jaggi,et al. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization , 2013, ICML.
[28] M. Canon,et al. A Tight Upper Bound on the Rate of Convergence of Frank-Wolfe Algorithm , 1968 .