Probabilistic Backward and Forward Reasoning in Stochastic Relational Worlds

Inference in graphical models has emerged as a promising technique for planning. A recent approach to decision-theoretic planning in relational domains uses forward inference in dynamic Bayesian networks compiled from learned probabilistic relational rules. Inspired by work in non-relational domains with small state spaces, we derive a back-propagation method for such networks in relational domains, starting from a mixture distribution over goal states. We combine this with forward reasoning in a bidirectional two-filter approach. We perform experiments in a complex 3D simulated desktop environment with an articulated manipulator and realistic physics. Empirical results show that bidirectional probabilistic reasoning can lead to more efficient and accurate planning compared to pure forward reasoning.
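The bidirectional two-filter idea can be illustrated on a toy Markov chain: a forward pass propagates the start-state distribution through the transition model, a backward pass propagates a goal-state mixture in reverse, and the two messages are combined into a smoothed belief at each time step. The sketch below is a minimal illustration under assumed toy numbers (the two-state chain, transition matrix, and horizon are all hypothetical, not from the paper), not the paper's relational DBN machinery.

```python
# Toy two-state Markov chain: a minimal sketch of two-filter smoothing,
# combining a forward pass from the start state with a backward pass
# from a goal-state mixture. All names and numbers are illustrative.
T = [[0.9, 0.1],
     [0.2, 0.8]]          # T[i][j] = P(next = j | current = i)
start = [1.0, 0.0]        # forward filter: initial state distribution
goal = [0.0, 1.0]         # backward filter: goal state mixture
H = 5                     # planning horizon

def forward(msg):
    # one forward step: msg'[j] = sum_i msg[i] * T[i][j]
    return [sum(msg[i] * T[i][j] for i in range(2)) for j in range(2)]

def backward(msg):
    # one backward step: msg'[i] = sum_j T[i][j] * msg[j]
    return [sum(T[i][j] * msg[j] for j in range(2)) for i in range(2)]

# forward messages a[t]: belief over the state at time t given the start
a = [start]
for _ in range(H):
    a.append(forward(a[-1]))

# backward messages b[t]: likelihood of reaching the goal mixture from x_t
b = [goal]
for _ in range(H):
    b.insert(0, backward(b[0]))

# smoothed belief at each time: proportional to forward * backward message
posterior = []
for ai, bi in zip(a, b):
    p = [x * y for x, y in zip(ai, bi)]
    z = sum(p)
    posterior.append([x / z for x in p])
```

At the endpoints the smoothed belief collapses onto the start state and the goal mixture respectively, while intermediate steps blend reachability from the start with attainability of the goal, which is what makes the bidirectional combination more informative than forward filtering alone.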
