Computational simulation and the search for a quantitative description of simple reinforcement schedules

We discuss schedules of reinforcement in their theoretical and practical terms, pointing to practical limitations on implementing those schedules and to the advantages of computational simulation. In this paper, we present an R script named Beak, built to simulate rates of behavior interacting with schedules of reinforcement. Using Beak, we have simulated data that allow an assessment of different reinforcement feedback functions (RFFs). This was done with unparalleled precision, since simulations provide very large samples of data and, more importantly, simulated behavior is not changed by the reinforcement it produces; therefore, we can vary it systematically. We compared different RFFs for random-interval (RI) schedules using four criteria: meaning, precision, parsimony, and generality. Our results indicate that the best feedback function for the RI schedule was published by Baum (1981). We also propose that the model used by Killeen (1975) is a viable feedback function for the RDRL schedule. We argue that Beak paves the way for a greater understanding of schedules of reinforcement by addressing still-open questions about their quantitative features. Such simulations could also guide future experiments that use schedules as theoretical and methodological tools.

Schedules of reinforcement are core concepts for the experimental analysis of behavior. The algorithms and rules that define schedules, however, are usually taken for granted, except in early works (e.g., Catania & Reynolds, 1968; Ferster & Skinner, 1957; Fleshler & Hoffman, 1962; Millenson, 1963). The absence of schedule appraisal in the current literature is a potential problem. Pioneering methods to study schedule parameters were restricted by the technology available at that time.
In fact, current operant chambers, although controlled by modern computers, still rely on algorithms strongly tied to the early electromechanical devices, which can be regarded as a waste of resources. More recent technologies pave the way for a precise quantitative description of schedules. Such a quantitative analysis would directly address some old yet still pending questions about schedules of reinforcement (e.g., Baum, 1973, 1993; Catania & Reynolds, 1968; Rachlin, 1978; Killeen, 1975) and guide future research that uses schedules as a methodological tool.

In this work, we aim to resume the long-dormant discussion about quantitative features of simple schedules. For this purpose, we present a computational routine called Beak, built to simulate rates of behavior interacting with schedules of reinforcement. Our major contribution is that this software allows us to test a vast range of possible response rates without having to rely on extensive experimentation with actual subjects.

The absence of discussion about the schedule algorithms used across many experiments suggests an apparent, but false, consensus. There are several critical aspects to defining and implementing schedules of reinforcement, which were already recognized by Ferster & Skinner in their seminal work. According to these authors, every schedule of reinforcement could be “represented by a certain arrangement of timers, counters and relay circuits” (Ferster & Skinner, 1957, p. 10). Still, most textbooks and technical papers omit relevant details about schedule algorithms and emphasize the behavioral patterns associated with each simple schedule (e.g., Catania & Reynolds, 1968; Mazur, 2016; Pierce & Cheney, 2017). This discussion, however, is not confined to purely theoretical matters. Schedules of reinforcement are crucial methodological tools that behavioral scientists use to analyze many experimental results.
The correct interpretation of these results relies on clear schedule definitions when schedules are applied to problems such as discrimination learning, studied with multiple schedules (Ferster & Skinner, 1957; Weiss & Ost, 1974); observing behavior and conditioned reinforcement (Wyckoff, 1969); choice, studied with concurrent schedules (Herrnstein, 1961, 1970); self-control, studied with concurrent chained schedules (Rachlin & Green, 1972); behavioral pharmacology (Dews, 1962; Reilly, 2003); and decision making and bias (Fantino, 1998; Goodie & Fantino, 1995).

It is important to emphasize that these simulations do not replace the study of animal behavior. Simulations are concerned with mapping an entire schedule, going through a large range of possible response rates and exhaustively repeating these conditions. In this sense, Beak can guide a researcher in creating an experimental scenario to which a biological being can be purposefully subjected. Since this biological being will behave at a certain response rate, comparing its behavior with the simulation predictions may clarify biases and constraints of actual behavior. In other words, simulations map the normative rules of schedules, while experiments map the effective behaviors of organisms.
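
The mapping described above can be illustrated with a minimal Python sketch. It is not Beak's code (Beak is an R script) and it rests on assumptions of its own: time is cut into discrete ticks of `dt` seconds, a random-interval schedule arms a reinforcer with probability `dt / mean_interval_s` on each tick, and a simulated responder emits a response with a fixed probability per tick regardless of what it earns. The function name `simulate_ri` and all parameter names are hypothetical.

```python
import random


def simulate_ri(mean_interval_s, response_rate_per_s,
                session_s=50_000, dt=0.1, seed=0):
    """Return obtained reinforcers per second on a toy random-interval schedule.

    Assumptions (not taken from Beak): discrete ticks of `dt` seconds,
    Bernoulli reinforcer setup, Bernoulli responding at a constant rate.
    """
    rng = random.Random(seed)
    p_setup = dt / mean_interval_s          # chance a reinforcer is armed in a tick
    p_response = response_rate_per_s * dt   # chance a response occurs in a tick
    if not (0 < p_setup <= 1 and 0 <= p_response <= 1):
        raise ValueError("dt is too coarse for these parameters")

    armed = False       # True while a reinforcer waits for the next response
    reinforcers = 0
    for _ in range(round(session_s / dt)):
        # The schedule arms at most one reinforcer and then holds it.
        if not armed and rng.random() < p_setup:
            armed = True
        # Responding is independent of reinforcement, so its rate can be
        # varied systematically by the caller.
        if rng.random() < p_response and armed:
            reinforcers += 1
            armed = False
    return reinforcers / session_s


if __name__ == "__main__":
    # Sweep a few response rates on an RI 30-s schedule.
    for rate in (0.05, 0.5, 5.0):
        obtained = simulate_ri(30.0, rate)
        print(f"responses/s = {rate:4.2f}  reinforcers/s = {obtained:.4f}")
```

Because the response rate is an input rather than an outcome, sweeping it over a grid yields the obtained reinforcement rate as a function of response rate, which is the empirical counterpart of a feedback function for this toy schedule.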

[1]  W. Baum,et al.  Performances on ratio and interval schedules of reinforcement: Data and theory. , 1993, Journal of the experimental analysis of behavior.

[2]  L. B. Wyckoff The role of observing responses in discrimination learning. Part I. , 1952 .

[3]  W. K. Honig,et al.  Handbook of Operant Behavior , 2022 .

[4]  C. Kelly,et al.  Validity of self-reported height and weight for estimating prevalence of overweight among Estonian adolescents: the Health Behaviour in School-aged Children study , 2015, BMC Research Notes.

[5]  W M Baum,et al.  Optimization and the matching law as accounts of instrumental behavior. , 1981, Journal of the experimental analysis of behavior.

[6]  Edmund Fantino,et al.  An Experientially Derived Base-Rate Error in Humans , 1995 .

[7]  W. Feller,et al.  An Introduction To Probability Theory And Its Applications , 1950 .

[8]  M. Sidman,et al.  Fixed-interval and fixed-ratio reinforcement schedules with human subjects , 1988, The Analysis of verbal behavior.

[9]  R J HERRNSTEIN,et al.  Relative and absolute strength of response as a function of frequency of reinforcement. , 1961, Journal of the experimental analysis of behavior.

[10]  H Rachlin,et al.  Matching and maximizing with concurrent ratio-interval schedules. , 1983, Journal of the experimental analysis of behavior.

[11]  Stephen Ambler,et al.  A mathematical model of learning under schedules of interresponse time reinforcement , 1973 .

[12]  E Fantino,et al.  Behavior analysis and decision making. , 1998, Journal of the experimental analysis of behavior.

[13]  H S HOFFMAN,et al.  A progression for generating variable-interval schedules. , 1962, Journal of the experimental analysis of behavior.

[14]  P. Chance Learning and Behavior , 1979 .

[15]  K. Lattal,et al.  An experimental analysis of the extinction-induced response burst. , 2020, Journal of the experimental analysis of behavior.

[16]  D. Elliffe,et al.  The natural mathematics of behavior analysis. , 2018, Journal of the experimental analysis of behavior.

[17]  R. Herrnstein On the law of effect. , 1970, Journal of the experimental analysis of behavior.

[18]  M. Reilly Extending mathematical principles of reinforcement into the domain of behavioral pharmacology , 2003, Behavioural Processes.

[19]  A. Machado Learning the temporal dynamics of behavior. , 1997, Psychological review.

[20]  Drazen Prelec,et al.  Matching, maximizing, and the hyperbolic reinforcement feedback function. , 1982 .

[21]  P. Killeen On the temporal control of behavior. , 1975 .

[22]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[23]  Rob J Hyndman,et al.  Computing and Graphing Highest Density Regions , 1996 .

[24]  W. Pierce,et al.  Behavior Analysis and Learning: A Biobehavioral Approach , 2017 .

[25]  G. S. Reynolds,et al.  A quantitative analysis of the responding maintained by interval schedules of reinforcement. , 1968, Journal of the experimental analysis of behavior.

[26]  J. R. Millenson,et al.  Random interval schedules of reinforcement. , 1963, Journal of the experimental analysis of behavior.

[27]  H. Rachlin Judgment, Decision, and Choice: A Cognitive/Behavioral Synthesis , 1989 .

[28]  L. Wyckoff The role of observing responses in discrimination learning. , 1952, Psychological review.

[29]  H. Rachlin A molar theory of reinforcement schedules. , 1978, Journal of the experimental analysis of behavior.

[30]  H Rachlin,et al.  Commitment, choice and self-control. , 1972, Journal of the experimental analysis of behavior.

[31]  M Galizio,et al.  Laboratory Lore and Research Practices in the Experimental Analysis of Human Behavior: Selecting Reinforcers and Arranging Contingencies , 1988, The Behavior analyst.

[32]  S. J. Weiss,et al.  Response discriminative and reinforcement factors in stimulus control of performance on multiple and chained schedules of reinforcement , 1974 .

[33]  B. Greer,et al.  Minimizing resurgence of destructive behavior using behavioral momentum theory , 2018, Journal of applied behavior analysis.

[34]  W M Baum,et al.  The correlation-based law of effect. , 1973, Journal of the experimental analysis of behavior.

[35]  W M Baum,et al.  In search of the feedback function for variable-interval schedules. , 1992, Journal of the experimental analysis of behavior.