Logical Markov Decision Programs

Motivated by the interest in relational reinforcement learning, we introduce a novel representation formalism, called logical Markov decision programs (LOMDPs), that integrates Markov Decision Processes with Logic Programs. Using LOMDPs one can compactly and declaratively represent complex relational Markov decision processes. Within this framework we then develop a theory of reinforcement learning in which abstraction (of states and actions) plays a major role. The framework presented should provide a basis for further developments in relational reinforcement learning.

[1]  Eric B. Baum,et al.  Toward a Model of Intelligence as an Economy of Agents , 1999, Machine Learning.

[2]  Dana H. Ballard,et al.  Learning to perceive and act by trial and error , 1991, Machine Learning.

[3]  Luc De Raedt,et al.  Towards Discovering Structural Signatures of Protein Folds Based on Logical Hidden Markov Models , 2003, Pacific Symposium on Biocomputing.

[4]  Michael I. Jordan,et al.  MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .

[5]  Thomas G. Dietterich What is machine learning? , 2020, Archives of Disease in Childhood.

[6]  De,et al.  Relational Reinforcement Learning , 2022 .

[7]  Drew McDermott,et al.  Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning About Change , 1994, Artif. Intell..

[8]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[9]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[10]  David Andre,et al.  Programmable Reinforcement Learning Agents , 2000, NIPS.

[11]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[12]  Luc De Raedt,et al.  Towards Combining Inductive Logic Programming with Bayesian Networks , 2001, ILP.

[13]  Balaraman Ravindran,et al.  Model Minimization in Hierarchical Reinforcement Learning , 2002, SARA.

[14]  Thomas G. Dietterich Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..

[15]  Robert Givan,et al.  Equivalence notions and model minimization in Markov decision processes , 2003, Artif. Intell..

[16]  Luc De Raedt,et al.  Inductive Logic Programming: Theory and Methods , 1994, J. Log. Program..

[17]  Doina Precup,et al.  Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..

[18]  Craig Boutilier,et al.  Stochastic dynamic programming with factored representations , 2000, Artif. Intell..

[19]  Lise Getoor,et al.  Learning Probabilistic Relational Models , 1999, IJCAI.

[20]  Craig Boutilier,et al.  Symbolic Dynamic Programming for First-Order MDPs , 2001, IJCAI.

[21]  Andrew McCallum,et al.  Reinforcement learning with selective perception and hidden state , 1996 .

[22]  Tim Oates,et al.  The Thing that we Tried Didn't Work very Well: Deictic Representation in Reinforcement Learning , 2002, UAI.

[23]  Craig Boutilier,et al.  Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..

[24]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[25]  John K. Slaney,et al.  Blocks World revisited , 2001, Artif. Intell..

[26]  S. Muggleton Stochastic Logic Programs , 1996 .

[27]  Pedro M. Domingos,et al.  Relational Markov models and their application to adaptive web navigation , 2002, KDD.