Reinforcement learning with rare significant events: direct policy search vs. gradient policy search
[1] Jason D. Lee, et al. On the Power of Over-parametrization in Neural Networks with Quadratic Activation, 2018, ICML.
[2] Shie Mannor, et al. Reinforcement learning in the presence of rare events, 2008, ICML '08.
[3] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[4] Vivek S. Borkar, et al. A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events, 2006, J. Mach. Learn. Res.
[5] Nicolas Bredeche, et al. Policy Search with Rare Significant Events: Choosing the Right Partner to Cooperate with, 2021, ArXiv.
[6] Shimon Whiteson, et al. OFFER: Off-Environment Reinforcement Learning, 2017, AAAI.