Inverse Constrained Reinforcement Learning
[1] S. Levine, et al. Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers, 2020, ICLR.
[2] Dmitry Berenson, et al. Learning Constraints From Locally-Optimal Demonstrations Under Cost Function Uncertainty, 2020, IEEE Robotics and Automation Letters.
[3] S. Sastry, et al. Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning, 2019, ICLR.
[4] Miroslav Dudík, et al. Reinforcement Learning with Convex Constraints, 2019, NeurIPS.
[5] Yisong Yue, et al. Batch Policy Learning under Constraints, 2019, ICML.
[6] P. Abbeel, et al. Preferences Implicit in the State of the World, 2019, ICLR.
[7] Kee-Eung Kim, et al. A Bayesian Approach to Generative Adversarial Imitation Learning, 2018, NeurIPS.
[8] Shane Legg, et al. Scalable agent alignment via reward modeling: a research direction, 2018, ArXiv.
[9] Michael Gleicher, et al. Inferring geometric constraints in human demonstrations, 2018, CoRL.
[10] Shie Mannor, et al. Reward Constrained Policy Optimization, 2018, ICLR.
[11] Ofir Nachum, et al. A Lyapunov-based Approach to Safe Reinforcement Learning, 2018, NeurIPS.
[12] Laurent Orseau, et al. AI Safety Gridworlds, 2017, ArXiv.
[13] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[14] Anca D. Dragan, et al. Active Preference-Based Learning of Reward Functions, 2017, Robotics: Science and Systems.
[15] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[16] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[17] Julie A. Shah, et al. C-LEARN: Learning geometric constraints from demonstrations for multi-step manipulation in shared autonomy, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[18] Guan Wang, et al. Interactive Learning from Policy-Dependent Human Feedback, 2017, ICML.
[19] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[20] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[21] J. Schulman, et al. OpenAI Gym, 2016, ArXiv.
[22] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[23] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[24] Oliver Kroemer, et al. Active Reward Learning, 2014, Robotics: Science and Systems.
[25] Jonathan P. How, et al. Bayesian Nonparametric Inverse Reinforcement Learning, 2012, ECML/PKDD.
[26] Shalabh Bhatnagar, et al. An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes, 2010, Syst. Control. Lett.
[27] Brian D. Ziebart, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[28] Eyal Amir, et al. Bayesian Inverse Reinforcement Learning, 2007, IJCAI.
[29] Rüdiger Dillmann, et al. Learning sequential constraints of tasks from user demonstrations, 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005.
[30] E. Altman. Constrained Markov Decision Processes, 1999.
[31] Inverse Constrained Reinforcement Learning, 2021.
[32] Hongxia Jin, et al. Text-Based Interactive Recommendation via Constraint-Augmented Reinforcement Learning, 2019, NeurIPS.
[33] Dario Amodei, et al. Benchmarking Safe Exploration in Deep Reinforcement Learning, 2019.