Constructive reinforcement learning

This paper presents an operative measure of reinforcement for constructive learning methods, i.e., eager learning methods using highly expressive (or universal) representation languages. These evaluation tools give further insight into the study of the growth of knowledge, theory revision and abduction. The final approach is based on an apportionment of credit with respect to the ‘course’ that the evidence makes through the learnt theory. Our measure of reinforcement is shown to be justified by cross-validation and by its connection with other successful evaluation criteria, like the MDL principle. Finally, the relation with the classical view of reinforcement is studied, where the actions of an intelligent system can be rewarded or penalised, and we discuss whether this should affect our distribution of reinforcement. The most important result of this paper is that the way we distribute reinforcement into knowledge results in a rated ontology, instead of a single prior distribution. This detailed information can therefore be exploited to guide the search of inductive learning algorithms. Likewise, knowledge revision can be applied to the part of the theory that is not justified by the evidence. © XXXX John Wiley & Sons, Ltd.
