Contextual Sequence Prediction with Application to Control Library Optimization

Sequence optimization, where the items in a list are ordered to maximize some reward, has many applications such as web advertisement placement, search, and control libraries in robotics. Previous work in sequence optimization produces a static ordering that does not take any features of the items or the context of the problem into account. In this work, we propose a general approach to order the items within the sequence based on the context (e.g., perceptual information, environment description, and goals). We take a simple, efficient, reduction-based approach where the choice and order of the items are established by repeatedly learning simple classifiers or regressors for each “slot” in the sequence. Our approach leverages recent work on submodular function maximization to provide a formal regret reduction from submodular sequence optimization to simple cost-sensitive prediction. We apply our contextual sequence prediction algorithm to optimize control libraries and demonstrate results on two robotics problems: manipulator trajectory prediction and mobile robot path planning.
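
As a rough illustration of the per-slot reduction described above, the following is a minimal sketch and not the algorithm as published. It assumes a single discrete context feature, a toy reward that is 1 if any item in the sequence succeeds, and a tabular stand-in for the cost-sensitive classifier. All names (`reward`, `train_slot_predictors`, `predict_sequence`) and the example data are hypothetical.

```python
from collections import defaultdict


def reward(successes, seq):
    """Toy monotone submodular reward: 1.0 if any item in seq succeeds, else 0.0."""
    return 1.0 if any(item in successes for item in seq) else 0.0


def train_slot_predictors(examples, num_items, num_slots):
    """Learn one predictor per slot, in order.

    examples: list of (feature, set_of_successful_items) pairs.
    Returns a list of dicts, one per slot, mapping feature -> chosen item.
    """
    predictors = []
    prefixes = [[] for _ in examples]  # items already chosen for each example
    for _ in range(num_slots):
        # Cost-sensitive step (tabular stand-in for a classifier): for each
        # feature value, sum the marginal benefit of every item over the
        # training examples, given what earlier slots already picked.
        gain = defaultdict(lambda: [0.0] * num_items)
        for (feature, successes), prefix in zip(examples, prefixes):
            base = reward(successes, prefix)
            for item in range(num_items):
                gain[feature][item] += reward(successes, prefix + [item]) - base
        # Ties are broken in favor of the lowest item index.
        table = {f: max(range(num_items), key=g.__getitem__) for f, g in gain.items()}
        predictors.append(table)
        # Extend each example's prefix with this slot's prediction.
        for (feature, _), prefix in zip(examples, prefixes):
            prefix.append(table[feature])
    return predictors


def predict_sequence(predictors, feature, default_item=0):
    """Build the sequence for a context by querying each slot's predictor."""
    return [table.get(feature, default_item) for table in predictors]


# Hypothetical training data: (context feature, items that succeed there).
examples = [
    ("cluttered", {0}),
    ("cluttered", {0}),
    ("cluttered", {2}),
    ("open", {1}),
    ("open", {1, 2}),
]
predictors = train_slot_predictors(examples, num_items=3, num_slots=2)
print(predict_sequence(predictors, "cluttered"))
print(predict_sequence(predictors, "open"))
```

Each slot is trained only on the marginal benefit left over after the earlier slots, so later slots tend to cover the cases that earlier ones miss. A real system would replace the lookup table with a classifier or regressor over richer context features.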
