An Architecture for Structured, Concurrent, Real-Time Action

I present a computational architecture designed to capture several properties essential to action: compositionality, concurrency, quick reaction, and resilience to unexpected events. The architecture uses a structured model of internal state and complex inference about the environment to inform decision-making. It achieves these properties by combining interacting procedural and probabilistic representations: the structure of actions is captured by Petri nets, which are informed by, and in turn affect, a model of the world represented as a Probabilistic Relational Model. I give a theoretical analysis of the architecture and demonstrate its use in a simulated robotic environment.
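To make the coupling between the procedural and probabilistic sides concrete, the following is a minimal sketch, not the architecture's actual implementation. It assumes a plain place/transition Petri net whose transitions carry a guard over a dictionary of belief values. That dictionary is a hypothetical stand-in for queries against the Probabilistic Relational Model, and all names (`Transition`, `PetriNet`, `object_in_range`, `path_clear`, the thresholds) are illustrative only. The sketch shows one direction of the interaction, beliefs gating actions; the reverse direction, actions updating the world model, is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Transition:
    name: str
    inputs: Dict[str, int]   # place -> number of tokens consumed
    outputs: Dict[str, int]  # place -> number of tokens produced
    # Hypothetical guard over belief values (stand-in for PRM inference).
    guard: Callable[[Dict[str, float]], bool] = lambda beliefs: True


@dataclass
class PetriNet:
    marking: Dict[str, int]
    transitions: List[Transition]

    def enabled(self, t: Transition, beliefs: Dict[str, float]) -> bool:
        # A transition is enabled when every input place holds enough
        # tokens and its belief guard is satisfied.
        has_tokens = all(self.marking.get(p, 0) >= n for p, n in t.inputs.items())
        return has_tokens and t.guard(beliefs)

    def fire(self, t: Transition) -> None:
        # Consume tokens from input places, then produce tokens in outputs.
        for p, n in t.inputs.items():
            self.marking[p] -= n
        for p, n in t.outputs.items():
            self.marking[p] = self.marking.get(p, 0) + n

    def step(self, beliefs: Dict[str, float]) -> List[str]:
        # Check transitions in order and fire each one that is enabled
        # at the moment it is checked; return the names that fired.
        fired = []
        for t in self.transitions:
            if self.enabled(t, beliefs):
                self.fire(t)
                fired.append(t.name)
        return fired


# Two independent tokens model two concurrent activities (arm and base).
net = PetriNet(
    marking={"arm_idle": 1, "base_idle": 1},
    transitions=[
        Transition("reach", {"arm_idle": 1}, {"arm_reaching": 1},
                   guard=lambda b: b["object_in_range"] > 0.8),
        Transition("drive", {"base_idle": 1}, {"base_moving": 1},
                   guard=lambda b: b["path_clear"] > 0.5),
    ],
)

first = net.step({"object_in_range": 0.9, "path_clear": 0.3})
second = net.step({"object_in_range": 0.9, "path_clear": 0.7})
print(first, second, net.marking)
```

In this toy run the arm starts reaching while the base waits, and the base begins moving only once the belief that the path is clear rises above its threshold. Separate tokens give concurrency, and guards re-evaluated on every step give quick reaction to changed beliefs.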
