Li Zhao | Jian Shen | Tie-Yan Liu | Zhengyu Yang | Weinan Zhang | Minghuan Liu | Hanye Zhao
[1] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[2] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[3] Sergey Levine, et al. Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning, 2019, ArXiv.
[4] Bo Dai, et al. GenDICE: Generalized Offline Estimation of Stationary Values, 2020, ICLR.
[5] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[6] Sergey Levine, et al. Conservative Q-Learning for Offline Reinforcement Learning, 2020, NeurIPS.
[7] Thorsten Joachims, et al. MOReL: Model-Based Offline Reinforcement Learning, 2020, NeurIPS.
[8] Sergey Levine, et al. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models, 2016, ArXiv.
[9] Yang Yu, et al. On Value Discrepancy of Imitation Learning, 2019, ArXiv.
[10] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[11] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[12] Bo Dai, et al. DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections, 2019, NeurIPS.
[13] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[14] Che Wang, et al. BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning, 2019, NeurIPS.
[15] Sergey Levine, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[16] J. Andrew Bagnell, et al. Efficient Reductions for Imitation Learning, 2010, AISTATS.
[17] Claude Sammut, et al. A Framework for Behavioural Cloning, 1995, Machine Intelligence 15.
[18] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[19] Leland McInnes, et al. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, 2018, ArXiv.
[20] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[21] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[22] Zhen Xu, et al. NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning, 2021, NeurIPS.
[23] Martin A. Riedmiller, et al. Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning, 2020, ICLR.
[24] Qing Wang, et al. Exponentially Weighted Imitation Learning for Batched Historical Data, 2018, NeurIPS.
[25] M. Rosenblatt. Remarks on Some Nonparametric Estimates of a Density Function, 1956, The Annals of Mathematical Statistics.
[26] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[27] Richard Zemel, et al. A Divergence Minimization Perspective on Imitation Learning Methods, 2019, CoRL.