A selectionist approach to reinforcement.

We describe a principle of reinforcement that draws on experimental analyses of both behavior and neuroscience. Some implications of this principle for the interpretation of behavior are explored using computer simulations of adaptive neural networks. The simulations indicate that a single reinforcement principle, implemented in a biologically plausible neural network, can produce, as its cumulative product, networks that mediate many of the phenomena generated by respondent and operant contingencies. These phenomena include acquisition, extinction, reacquisition, conditioned reinforcement, and stimulus-control phenomena such as blocking and stimulus discrimination. The characteristics of the environment-behavior relations selected by the action of reinforcement on the connectivity of the network are consistent with behavior-analytic formulations: operants are not elicited; instead, the network permits them to be guided by the environment. Moreover, the guidance of behavior is context dependent: the pathways activated by a stimulus are determined in part by what other stimuli are acting on the network at that moment. In keeping with a selectionist approach to complexity, the cumulative effects of relatively simple reinforcement processes, acting upon adaptive neural networks, give promise of simulating the complex behavior of living organisms.
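
As a rough illustration of how one discrepancy-gated learning rule can yield acquisition, extinction, and blocking, the following minimal sketch may help. It is not the network or the reinforcement principle described in the paper: it substitutes a Rescorla-Wagner-style delta rule on a single output unit, and the function name `train`, the learning rate, and the trial counts are all hypothetical choices made for the example.

```python
# Illustrative stand-in only: a Rescorla-Wagner-style delta rule on one unit,
# NOT the paper's neural network. All names and parameters are assumptions.

def train(weights, trials, lr=0.2):
    """Update stimulus weights in place over (stimuli, reinforcer) trials."""
    for stimuli, reinforcer in trials:
        # The unit's output is the summed strength of the stimuli present.
        prediction = sum(weights[s] for s in stimuli)
        # Learning is gated by the discrepancy between reinforcer and output.
        error = reinforcer - prediction
        for s in stimuli:
            weights[s] += lr * error
    return weights

# Acquisition: stimulus A is repeatedly paired with the reinforcer.
w = {"A": 0.0, "B": 0.0}
train(w, [(("A",), 1.0)] * 50)
acquired = w["A"]  # approaches 1

# Extinction: A is then presented without the reinforcer.
train(w, [(("A",), 0.0)] * 50)
extinguished = w["A"]  # returns toward 0

# Blocking: pretraining A leaves almost no discrepancy for B to acquire
# strength when the compound AB is later reinforced.
blocked = {"A": 0.0, "B": 0.0}
train(blocked, [(("A",), 1.0)] * 50)
train(blocked, [(("A", "B"), 1.0)] * 50)

# Control: compound training without pretraining lets B gain strength.
control = {"A": 0.0, "B": 0.0}
train(control, [(("A", "B"), 1.0)] * 50)

print(acquired, extinguished, blocked["B"], control["B"])
```

In this sketch, B gains little strength in the blocking condition because the pretrained A already accounts for the reinforcer, whereas in the control condition A and B share the available strength.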
