Contextual Online Learning Selection of Finite State Machines for Mobile Robots
Pan Zhou | Gao Liang | Chuanzhe Cui
[1] Fumihide Tanaka,et al. Multitask Reinforcement Learning on the Distribution of MDPs , 2003 .
[2] Csaba Szepesvári,et al. X-armed Bandits , 2022 .
[3] Rüdiger Dillmann,et al. Probabilistic MDP-behavior planning for cars , 2011, 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC).
[4] Gediminas Adomavicius,et al. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions , 2005, IEEE Transactions on Knowledge and Data Engineering.
[5] Hui Li,et al. Multi-task Reinforcement Learning in Partially Observable Stochastic Environments , 2009, J. Mach. Learn. Res.
[6] Emilio Frazzoli,et al. Intention-Aware Motion Planning , 2013, WAFR.
[7] Markus Maurer,et al. Probabilistic online POMDP decision making for lane changes in fully automated driving , 2013, 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013).
[8] Tiziano Villa,et al. Synthesis of Finite State Machines: Logic Optimization , 1997 .
[9] Martin Pál,et al. Contextual Multi-Armed Bandits , 2010, AISTATS.
[10] Martial Hebert,et al. Multi-armed recommendation bandits for selecting state machine policies for robotic systems , 2013, 2013 IEEE International Conference on Robotics and Automation.
[11] Mihaela van der Schaar,et al. Online Learning in Large-Scale Contextual Recommender Systems , 2016, IEEE Transactions on Services Computing.
[12] R. Dillmann,et al. Design of the planner of team AnnieWAY’s autonomous vehicle used in the DARPA Urban Challenge 2007 , 2008, 2008 IEEE Intelligent Vehicles Symposium.
[13] Rüdiger Dillmann,et al. Probabilistic decision-making under uncertainty for autonomous driving using continuous POMDPs , 2014, 17th International IEEE Conference on Intelligent Transportation Systems (ITSC).
[14] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[15] P. W. Jones,et al. Multi-armed Bandit Allocation Indices , 1989 .
[16] Edwin Olson,et al. Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment , 2015, Autonomous Robots.
[17] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[18] Wei Chu,et al. Contextual Bandits with Linear Payoff Functions , 2011, AISTATS.
[19] Leo Liberti,et al. Bidirectional A* Search for Time-Dependent Fast Paths , 2008, WEA.
[20] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[21] Emilio Frazzoli,et al. A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles , 2016, IEEE Transactions on Intelligent Vehicles.