Graph kernels and Gaussian processes for relational reinforcement learning

RRL is a relational reinforcement learning system based on Q-learning in relational state-action spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. For relational reinforcement learning, the learning algorithm used to approximate the mapping between state-action pairs and their so-called Q(uality)-values has to be very reliable, and it has to be able to handle the relational representation of state-action pairs. In this paper we investigate the use of Gaussian processes to approximate the Q-values of state-action pairs. In order to employ Gaussian processes in a relational setting, we propose graph kernels as a covariance function between state-action pairs. The standard prediction mechanism for Gaussian processes requires a matrix inversion, which can become unstable when the kernel matrix has low rank. These instabilities can be avoided by employing QR factorization. This leads to better and more stable performance of the algorithm and to a more efficient incremental update mechanism. Experiments conducted in the blocks world and with the Tetris game show that Gaussian processes with graph kernels can compete with, and often improve on, regression trees and instance-based regression as a generalization algorithm for RRL.
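The abstract gives no code, so the following is only a minimal sketch of the idea of computing the Gaussian process predictive mean through a QR factorization rather than an explicit matrix inverse. Everything named here is an assumption made for illustration: the functions `rbf_kernel`, `qr_decompose`, `solve_qr` and `gp_mean` are hypothetical, state-action pairs are replaced by plain numbers, and a Gaussian (RBF) kernel stands in for the graph kernel proposed in the paper. The small `noise` term on the diagonal is likewise an assumption that keeps the toy kernel matrix full rank; the paper's own treatment of low-rank kernel matrices and its incremental update are not reproduced.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Stand-in covariance function (assumption); the paper uses graph kernels."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def qr_decompose(a):
    """QR factorization of a square matrix by modified Gram-Schmidt.

    Returns (q_cols, r), where q_cols[j] is the j-th orthonormal column
    of Q and r is upper triangular. Assumes the matrix has full rank.
    """
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            # Remove the component of v along the already computed q_k.
            r[k][j] = sum(q_cols[k][i] * v[i] for i in range(n))
            v = [v[i] - r[k][j] * q_cols[k][i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    return q_cols, r


def solve_qr(a, b):
    """Solve A x = b as R x = Q^T b, using back substitution."""
    n = len(a)
    q_cols, r = qr_decompose(a)
    qtb = [sum(q_cols[j][i] * b[i] for i in range(n)) for j in range(n)]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = qtb[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / r[i][i]
    return x


def gp_mean(train_x, train_q, query, noise=1e-6):
    """Predictive mean k(query)^T (K + noise * I)^{-1} q, solved via QR."""
    n = len(train_x)
    k = [
        [rbf_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_qr(k, train_q)
    return sum(alpha[i] * rbf_kernel(train_x[i], query) for i in range(n))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]   # toy stand-ins for state-action pairs
    qs = [0.0, 1.0, 0.5]   # toy Q-values
    print(gp_mean(xs, qs, 1.5))
```

In a relational setting one would swap `rbf_kernel` for a kernel defined on graph encodings of state-action pairs; the solve step itself would not need to change, since it only sees the kernel matrix.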
