Reinforcement Learning with Convex Constraints
Miroslav Dudík | Robert E. Schapire | Hal Daumé | Kianté Brantley | Sobhan Miryoosefi
[1] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[2] M. Sion. On general minimax theorems, 1958.
[3] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[4] J. Neumann. Zur Theorie der Gesellschaftsspiele, 1928.
[5] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.
[6] Yisong Yue, et al. Batch Policy Learning under Constraints, 2019, ICML.
[7] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[8] R. Dykstra, et al. A Method for Finding Projections onto the Intersection of Convex Sets in Hilbert Spaces, 1986.
[9] E. Altman. Constrained Markov Decision Processes, 1999.
[10] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[11] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.
[12] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[13] J. M. Ingram, et al. Projections onto convex cones in Hilbert space, 1991.
[14] Peter L. Bartlett, et al. Blackwell Approachability and No-Regret Learning are Equivalent, 2010, COLT.
[15] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.
[16] Shie Mannor, et al. Reward Constrained Policy Optimization, 2018, ICLR.