Reinforcement Learning as a Context for Integrating AI Research

As Baum argues, reinforcement learning is essential to intelligence (Baum 2004). It enabled humans who evolved in the tropics to satisfy their needs for food and warmth in the Arctic. Well-known reinforcement learning algorithms have been identified in the neural behaviors of mammalian brains (Brown, Bullock and Grossberg 1999; Seymour et al. 2004). A brain senses and acts in the world, and learns behaviors reinforced by values that distinguish good and bad outcomes. The brain learns a simulation model for tracing cause-and-effect relations between behaviors and outcomes, that is, for solving the credit assignment problem (Sutton and Barto 1998). Reason and high-level representations of sense information are part of this simulation model, and language is a representation of the model for exchange with other brains (language may also serve internally to increase the efficiency of the brain's simulation model). Thus the simulation model and its role in reinforcement learning provide a context for integrating different AI subfields.

Brains can learn simulation models as internal behaviors that predict sense information, reinforced by predictive accuracy. For example, learning to predict short-term changes to visual information based on body motion may be the basis for learning 3-D allocentric representations of vision. Learning to predict longer-term changes to visual information, sometimes in response to the brain's motor behaviors, may be the basis for learning to partition the visual field into objects, to classify those objects, and to model the behaviors of object classes. Thus processes reinforced by predictive accuracy may learn a simulation model that is useful to other processes that learn to satisfy basic physical needs.

This suggests a brain design partitioned into interacting learning processes, each defined by a set of inputs (some sensory, some from other brain processes), an internal representation, a set of outputs (some motor, some to other brain processes), and a reinforcement value. This is similar to Minsky's notion of a brain comprising a society of agents, each implementing a different way to think (Minsky, Singh and Sloman 2004). Brain processes may interact in a variety of ways. Processes reinforced by short-term predictive accuracy may produce useful input to processes making longer-term predictions. Predictive processes may help trace cause-and-effect relations between behaviors and rewards in other processes. Representations learned by one process may play a role in the reinforcement values of other processes. As Baum observes, the immense evolutionary computation that learned …
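
To make the notion of a process reinforced by predictive accuracy concrete, the following is a minimal sketch, not drawn from the source: a linear model predicts the next sense vector from the current sense vector and a motor action, and its reinforcement value is the negative squared prediction error. The dimensions and the names predict and update are assumptions for illustration only.

```python
# Minimal sketch (illustrative assumptions, not the author's implementation):
# a process whose "behavior" is predicting the next sense vector and whose
# reinforcement value is predictive accuracy (negative squared error).

import numpy as np

SENSE_DIM, MOTOR_DIM = 8, 2                        # illustrative sizes
W = np.zeros((SENSE_DIM, SENSE_DIM + MOTOR_DIM))   # internal representation

def predict(sense, motor):
    """Predict the next sense vector from current sense and motor output."""
    return W @ np.concatenate([sense, motor])

def update(sense, motor, next_sense, lr=0.01):
    """Adjust the model to reduce prediction error and return this process's
    reinforcement value: the negative squared prediction error."""
    global W
    x = np.concatenate([sense, motor])
    error = next_sense - W @ x
    W += lr * np.outer(error, x)    # gradient step on squared error
    return -float(error @ error)    # predictive accuracy as reinforcement
```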
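
The partitioned design described above can also be sketched as a data structure: each process is defined by its inputs (sensory or from other processes), an internal representation, its outputs (motor or to other processes), and a reinforcement value. The sketch below assumes a shared blackboard through which processes exchange outputs; the names LearningProcess, blackboard, and echo_step are illustrative, not from the source.

```python
# Minimal sketch (illustrative, under assumed names) of interacting learning
# processes, each with inputs, an internal representation, outputs, and its
# own reinforcement value, exchanging outputs through a shared blackboard.

from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class LearningProcess:
    name: str
    inputs: List[str]                                    # sense channels or other processes
    step: Callable                                       # consumes inputs, returns (outputs, reward)
    representation: Dict = field(default_factory=dict)   # internal learned state

    def run(self, blackboard: Dict) -> float:
        ins = {k: blackboard.get(k) for k in self.inputs}
        outputs, reward = self.step(self.representation, ins)
        blackboard[self.name] = outputs   # outputs become inputs for other processes
        return reward                     # process-specific reinforcement value

def echo_step(rep, ins):
    """Placeholder behavior: pass inputs through unchanged; reward is 0."""
    return ins, 0.0

# Example wiring: a short-term predictive process whose output feeds a
# longer-term one, as in the interactions described above.
short = LearningProcess("short_term", inputs=["vision"], step=echo_step)
longer = LearningProcess("long_term", inputs=["short_term"], step=echo_step)

blackboard = {"vision": [0.1, 0.2]}
for proc in (short, longer):
    proc.run(blackboard)
```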