Minimax real-time heuristic search

Abstract

Real-time heuristic search methods interleave planning and plan execution, and plan only in the part of the domain around the current state of the agent. So far, real-time heuristic search methods have mostly been applied to deterministic planning tasks. In this article, we argue that real-time heuristic search methods can also solve nondeterministic planning tasks efficiently. We introduce Min-Max Learning Real-Time A∗ (Min-Max LRTA∗), a real-time heuristic search method that generalizes Korf's LRTA∗ to nondeterministic domains, and apply it to robot-navigation tasks in mazes, where the robots know the maze but do not know their initial position and orientation (pose). These tasks can be modeled as planning tasks in nondeterministic domains whose states are sets of poses. We show that Min-Max LRTA∗ solves the robot-navigation tasks quickly, converges quickly, and requires only a small amount of memory.
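To make the idea concrete, the following is a minimal sketch of a minimax value-update loop in the spirit of Min-Max LRTA∗ with a lookahead of one. It is not the authors' implementation. The function names, the unit action cost, the zero-initialized value table, and the toy corridor domain are all assumptions made for illustration. Each action is scored by its worst-case successor, the agent takes the action with the best worst case, and the value of the current state is raised accordingly before the action is executed.

```python
def min_max_lrta_star(start, is_goal, actions, successors, choose_outcome,
                      u=None, max_steps=10_000):
    """Hypothetical one-step-lookahead minimax LRTA*-style search.

    u maps states to estimates of the worst-case goal distance
    (missing entries count as 0, which is an admissible default).
    """
    u = {} if u is None else u
    state = start
    path = [state]
    for _ in range(max_steps):
        if is_goal(state):
            return path, u
        best_action, best_value = None, float("inf")
        for action in actions(state):
            # Assume nature picks the worst successor for the agent.
            worst = max(u.get(s2, 0) for s2 in successors(state, action))
            value = 1 + worst  # unit action cost (assumption)
            if value < best_value:
                best_action, best_value = action, value
        # Value update: estimates never decrease.
        u[state] = max(u.get(state, 0), best_value)
        # Execute the action; the environment resolves the nondeterminism.
        state = choose_outcome(successors(state, best_action))
        path.append(state)
    raise RuntimeError("step limit reached before the goal")


# Toy corridor (assumption): states 0..GOAL, two actions per state.
GOAL = 4

def corridor_actions(state):
    return ("safe", "risky")

def corridor_successors(state, action):
    if action == "safe":
        return [state + 1]                         # always one step forward
    return [min(state + 2, GOAL), max(state - 1, 0)]  # jump ahead or slip back

def run_corridor_demo():
    # Adversarial environment: always picks the lowest-numbered successor.
    return min_max_lrta_star(0, lambda s: s == GOAL, corridor_actions,
                             corridor_successors, min)

if __name__ == "__main__":
    path, u = run_corridor_demo()
    print(path, u)
```

In the robot-navigation setting described above, each state would be a set of possible poses rather than an integer, and the successors of an action would be the pose sets consistent with each possible observation. Passing the returned value table `u` into a later run is one way to carry learned estimates across trials.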
