Learning situation-dependent costs: improving planning from probabilistic robot execution

Real-world robot tasks are so complex that it is hard to hand-tune all of the domain knowledge, especially to model the dynamics of the environment. Several research efforts focus on applying machine learning to map learning, sensor/action mapping, and vision. The work presented in this paper explores machine learning techniques for robot planning. The goal is to use real robotic navigational execution as a data source for learning. Our system collects execution traces and extracts relevant information to improve the efficiency of generated plans. In this article, we present the representation of the path planner and the navigation modules, and describe the execution trace. We show how training data is extracted from the execution trace. We introduce the concept of situation-dependent costs, where situational features can be attached to the costs used by the path planner. In this way, the planner can generate paths that are appropriate for a given situation. We present experimental results from a simulated, controlled environment as well as from data collected from the actual robot.
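To make the idea of situation-dependent costs concrete, the following is a minimal sketch, not the authors' implementation: it assumes a hypothetical Arc/CostLearner interface in which execution traces update a multiplicative factor per (arc, situation) pair, and the planner queries the adjusted cost instead of the raw map cost.

```python
# Minimal sketch of situation-dependent arc costs.
# Names (Arc, CostLearner, situation tuples) are illustrative assumptions,
# not the paper's actual data structures.
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Arc:
    """One edge in the topological map used by the path planner."""
    arc_id: int
    base_cost: float  # e.g. expected traversal time from the map geometry


@dataclass
class CostLearner:
    """Maps (arc, situation) pairs to multiplicative cost factors
    learned from execution traces; unseen pairs default to 1.0."""
    factors: Dict[Tuple[int, Tuple], float] = field(default_factory=dict)

    def update(self, arc: Arc, situation: Tuple, observed_cost: float) -> None:
        # Store the ratio of observed execution cost to the map's base cost.
        self.factors[(arc.arc_id, situation)] = observed_cost / arc.base_cost

    def cost(self, arc: Arc, situation: Tuple) -> float:
        # The planner queries this instead of the raw base cost.
        return arc.base_cost * self.factors.get((arc.arc_id, situation), 1.0)


if __name__ == "__main__":
    corridor = Arc(arc_id=7, base_cost=10.0)
    learner = CostLearner()

    # Suppose execution traces show this corridor takes ~3x longer at lunchtime.
    learner.update(corridor, situation=("lunchtime",), observed_cost=30.0)

    print(learner.cost(corridor, ("lunchtime",)))  # 30.0 -> planner tends to avoid it
    print(learner.cost(corridor, ("evening",)))    # 10.0 -> default base cost
```

With costs conditioned on situational features in this way, the same map yields different routes in different situations, which is the behavior the experiments in simulation and on the actual robot are designed to demonstrate.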
