Inductive Biases in a Reinforcement Learner

Abstract: Reinforcement Learning Methods (RLMs) typically select candidate solutions stochastically, based on a credibility space of hypotheses that the RLM maintains either implicitly or explicitly. RLMs typically have both inductive and deductive aspects: they inductively improve their credibility space on a stage-by-stage basis, and they deductively select an appropriate response to incoming stimuli using their credibility space. In this sense, RLMs share learning attributes with active, incremental concept learners. Unlike some concept learners that employ deterministic procedures for selecting hypotheses, however, the evaluations of hypotheses provided to RLMs are often uncertain, either due to noisy environments or due to summary evaluations that occur only after a sequence of learner-environment interactions.

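To make the inductive/deductive split concrete, the sketch below implements a hypothetical CredibilitySpaceLearner in Python: it keeps an explicit credibility weight per hypothesis, deductively samples a hypothesis in proportion to those weights, and inductively reinforces whichever hypothesis earned a favorable (possibly noisy) evaluation. The update rule shown is the standard linear reward-inaction scheme from the learning-automata literature, used here only as an illustration, not necessarily the scheme intended in the paper; all names, parameters, and the toy environment are assumptions.

```python
import random


class CredibilitySpaceLearner:
    """Minimal sketch of a reinforcement learner that keeps an explicit
    credibility space: one probability weight per candidate hypothesis."""

    def __init__(self, hypotheses, learning_rate=0.1):
        self.hypotheses = list(hypotheses)
        self.learning_rate = learning_rate
        # Begin with uniform credibility over all hypotheses.
        self.credibility = [1.0 / len(self.hypotheses)] * len(self.hypotheses)

    def select(self):
        """Deductive step: stochastically pick a hypothesis in proportion
        to its current credibility."""
        return random.choices(range(len(self.hypotheses)),
                              weights=self.credibility, k=1)[0]

    def update(self, index, reward):
        """Inductive step: shift credibility toward the selected hypothesis
        when the (possibly noisy or delayed) evaluation is favorable.
        Linear reward-inaction scheme: ignore unfavorable evaluations."""
        if reward > 0:
            for i in range(len(self.credibility)):
                if i == index:
                    self.credibility[i] += self.learning_rate * (1.0 - self.credibility[i])
                else:
                    self.credibility[i] -= self.learning_rate * self.credibility[i]


if __name__ == "__main__":
    # Hypothetical noisy environment: hypothesis "B" is rewarded more often.
    learner = CredibilitySpaceLearner(["A", "B"])
    reward_prob = {"A": 0.3, "B": 0.8}
    for _ in range(2000):
        idx = learner.select()
        reward = 1 if random.random() < reward_prob[learner.hypotheses[idx]] else 0
        learner.update(idx, reward)
    # Credibility should concentrate on "B" despite the noisy evaluations.
    print(dict(zip(learner.hypotheses, (round(c, 3) for c in learner.credibility))))
```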