Integrating Relational Reinforcement Learning with Reasoning about Actions and Change

This paper presents an approach to integrating Relational Reinforcement Learning with Answer Set Programming and the Event Calculus. Our framework allows background and prior knowledge to be formulated in a semantically expressive formal language, and it enables the learning process to be constrained in a computationally efficient way by means of soft as well as compulsory (sub-)policies and (sub-)plans generated by an ASP solver. As part of this, a new planning-based approach to Relational Instance-Based Learning is proposed. An empirical evaluation shows significant improvements in learning efficiency and learning results across various benchmark settings.

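To make the idea of plan-constrained learning concrete, the following Python fragment is a minimal illustrative sketch (not the paper's actual implementation) of how (sub-)plans supplied by an external ASP solver could restrict or bias action selection in a tabular Q-learner: a hard ("compulsory") plan fixes the chosen action for its steps, while a soft plan only adds a bonus during greedy selection. The toy corridor environment, the hard-coded `asp_plan` list, and the `soft_bonus` parameter are assumptions introduced purely for illustration; in the framework described above such plans would be computed by an ASP solver such as clingo from the Potassco collection.

```python
import random
from collections import defaultdict

# Sketch: tabular Q-learning whose action selection is constrained (hard plan)
# or biased (soft plan) by an externally supplied plan. Here the plan is
# hard-coded; in the paper's setting it would be generated by an ASP solver.

ACTIONS = ["left", "right"]
GOAL = 4  # rightmost cell of a 5-cell toy corridor (illustrative stand-in)

def step(state, action):
    """Toy deterministic environment: move along cells 0..4, reward at the goal."""
    nxt = min(GOAL, state + 1) if action == "right" else max(0, state - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def choose_action(Q, state, t, hard_plan, soft_plan, eps=0.1, soft_bonus=0.5):
    """Epsilon-greedy selection, restricted or biased by plan step t (if any)."""
    if hard_plan is not None and t < len(hard_plan):
        return hard_plan[t]  # compulsory (sub-)plan: follow it exactly
    if random.random() < eps:
        return random.choice(ACTIONS)
    def score(a):
        # soft (sub-)plan: merely bias the greedy choice towards the suggested action
        bonus = soft_bonus if (soft_plan and t < len(soft_plan) and soft_plan[t] == a) else 0.0
        return Q[(state, a)] + bonus
    return max(ACTIONS, key=score)

def q_learning(episodes=200, alpha=0.5, gamma=0.9, hard_plan=None, soft_plan=None):
    Q = defaultdict(float)
    for _ in range(episodes):
        state, t, done = 0, 0, False
        while not done and t < 20:
            a = choose_action(Q, state, t, hard_plan, soft_plan)
            nxt, r, done = step(state, a)
            best_next = max(Q[(nxt, b)] for b in ACTIONS)
            Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
            state, t = nxt, t + 1
    return Q

if __name__ == "__main__":
    # A hypothetical plan an ASP solver might return for this toy task.
    asp_plan = ["right", "right", "right", "right"]
    Q_soft = q_learning(soft_plan=asp_plan)      # plan only biases exploration
    Q_hard = q_learning(hard_plan=asp_plan[:2])  # first two steps are compulsory
    print(sorted(Q_soft.items())[:4])
```

The distinction between the two plan roles mirrors the soft versus compulsory constraints mentioned in the abstract: a compulsory plan prunes the action space outright, whereas a soft plan leaves the learner free to deviate when its learned value estimates outweigh the plan's bonus.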