Long-Term Symbolic Learning in Soar and ACT-R

Abstract

The characteristics of long-term, symbolic learning were investigated using Soar and ACT-R models of a task to rearrange blocks into specific configurations. Long sequences of problems were run collecting data to answer fundamental questions about long-term, symbolic learning. The questions were whether symbolic learning continues indefinitely, how learned knowledge is used, and whether performance degrades over the long term. It was found that in both systems symbolic learning eventually stopped, ACT-R produced three observable phases of learning, and both Soar and ACT-R suffer from the utility problem of degraded performance with continuous on-line learning.

Introduction

Humans take years to develop the knowledge necessary for reasonably intelligent behavior. Building systems that achieve intelligent behavior is a major goal of the field of Artificial Intelligence (AI), but AI systems have not been run for the equivalent number of years. AI’s symbolic learning techniques are typically run only long enough to show performance improvements attributable to the new techniques (Kennedy & De Jong, 2003).

In addition, there appears to be a fundamental limit on current AI’s symbolic learning techniques. When symbolic learners have been run on series of problems, performance has been found to eventually degrade. This behavior was named the “utility problem” (Minton 1990). On the other hand, humans do not suffer from the utility problem and their long-term learning is familiar to all of us. Understanding the nature and characteristics of learning over the long term in humans is important to achieving intelligent behavior from artificial systems. Explorations into long-term learning that address the utility problem in Soar have been done (Kennedy & De Jong, 2003). This paper reports on research comparing long-term, symbolic learning in Soar and ACT-R.
Learning, or skill acquisition, has been proposed to include the move from problem solving to retrieval (Logan 1988) and to go through three stages (Anderson 1982; Fitts 1964). In the first stage, the cognitive stage, the knowledge is primarily declarative and must be interpreted. General problem-solving techniques, such as means-ends analysis, are employed at this initial stage. In the second stage, the associative stage, there is a mix of declarative and procedural knowledge, and problem solving is transitioning from general methods to methods specific to the problem domain. By the third stage, the autonomous stage, the knowledge is procedural: compiled, fast, and error-free. In this last stage, there is no problem solving, and performance improvements are based on psychomotor speedup, up to physical limits (Anderson et al., 2004). The three stages of behavior have been observed in the complex Kanfer-Ackerman air traffic controller task and correlated with observed behavior on simpler tasks (Ackerman, 1998, 1990). General intelligence correlated with the cognitive stage, perceptual speed with the associative stage, and psychomotor abilities with the autonomous stage. Taatgen and Lee (2003) applied cognitive modeling to this complex task and demonstrated learning in the three stages in one cognitive model.

By long-term symbolic learning (LTSL), we refer to the first two stages, i.e., up to the point where the system has reached steady-state behavior in a symbolic sense. Computational cognitive models of learning have been successfully built to simulate long-term learning, specifically lifetime learning of arithmetic (Lebiere 1999).

AI learning systems typically have not been run long enough to achieve steady-state behavior (Kennedy & De Jong, 2003). One reason is that Minton (1990) identified a performance problem in AI systems that learn continuously: problem-solving time grew with the number of productions in the system.
Minton's system and others demonstrated degraded performance after 100 or fewer problems were solved. Further research suggested the problem was universal (Holder 1990). On the other hand, Markovitch and Scott (1988) reported that their symbolic learner’s performance actually improved with forgetting. Their system first learned productions (macros) from 5000 training problems and established a level of performance, measured as a minimum number of nodes searched. As its learned productions were then incrementally removed, its performance improved. Performance peaked when approximately 90 percent of the learned productions had been removed. This surprising finding and the pervasive utility problem indicate weaknesses in the implemented

J. Gregory Trafton | William G. Kennedy