Algorithms for Planning under Uncertainty in Prediction and Sensing

For mobile robots, uncertainty is everywhere. Wheels slip. Sensors are affected by noise. Obstacles move unpredictably. Truly autonomous robots (and decision-makers or agents in general) must act in ways that are robust to these sorts of failures and unexpected events, which we may think of collectively as uncertainty. In this chapter, we attempt to meet uncertainty head-on by explicitly modeling it and reasoning about it. We use the term decision-theoretic planning to refer to this broad class of planning methods, which are characterized by explicit accounting for uncertainty. We will consider a number of formulations of the problem of planning under uncertainty and present algorithms for planning under these formulations. Uncertainty can take many forms, but for brevity and clarity we will restrict our attention to only two important types:
