Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems
Amarildo Likmeta | Alberto Maria Metelli | Giorgia Ramponi | Andrea Tirinzoni | Matteo Giuliani | Marcello Restelli