Intelligent exploration of unknown environments with vision-like sensors

In this work we present a methodology for intelligent path planning in an uncertain environment using vision-like sensors. We show that the path-planning problem can be posed as the adaptive control of an uncertain Markov decision process. The planning strategy then reduces to computing the control policy based on the current estimate of the environment, a strategy known as the "certainty equivalence" principle in the adaptive control literature. We propose a Monte Carlo-based estimation scheme, incorporating non-local sensors, for estimating the transition probabilities of the environment process, which significantly accelerates the convergence of the associated path-planning algorithms.
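The certainty-equivalence loop described above (plan optimally against the current estimate of the environment, act, then re-plan as the estimate improves) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the grid world, the `p_free` occupancy estimate, the blocked-cell penalty of 50, and the greedy policy extraction are all assumptions made for the sketch, and the Monte Carlo / non-local-sensor estimation scheme itself is not reproduced here.

```python
import numpy as np

def value_iteration(p_free, goal, n_iter=200, step_cost=1.0):
    """Certainty-equivalence planning: run value iteration treating the
    current estimate p_free[r, c] (probability that cell (r, c) is free)
    as if it were the true environment."""
    rows, cols = p_free.shape
    V = np.zeros((rows, cols))
    for _ in range(n_iter):
        V_new = np.full((rows, cols), np.inf)
        V_new[goal] = 0.0
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal:
                    continue
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        # Expected cost of the move: entering a cell that is
                        # likely blocked incurs a large (assumed) penalty.
                        penalty = 50.0 * (1.0 - p_free[nr, nc])
                        cost = step_cost + penalty + V[nr, nc]
                        V_new[r, c] = min(V_new[r, c], cost)
        V = V_new
    return V

def greedy_step(V, pos):
    """Execute the certainty-equivalent policy for one step: move to the
    neighbor with the lowest estimated cost-to-go."""
    rows, cols = V.shape
    best, best_v = pos, V[pos]
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = pos[0] + dr, pos[1] + dc
        if 0 <= nr < rows and 0 <= nc < cols and V[nr, nc] < best_v:
            best, best_v = (nr, nc), V[nr, nc]
    return best
```

In a full exploration loop, each new sensor reading would update `p_free` (in the paper, via the proposed Monte Carlo scheme with non-local, vision-like observations) and the value function would be recomputed, so the robot always acts on the best current estimate.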
