Cognitive Flexibility through Learning from Constraint Violations

Cognitive flexibility is an important goal in the computational modeling of higher cognition. An agent operating in a world that changes over time should adapt to those changes and update its knowledge accordingly. In this paper, we report on the implementation of a constraint-based mechanism for learning from negative outcomes in a well-established cognitive architecture, ICARUS. We discuss the challenges encountered during the implementation, describe how we solved them, and provide an example of the integrated system's operation.

1. Background and Rationale

An important goal in the computational modeling of higher cognition is to invent techniques that enable computer programs to mimic the broad human functionality that we call adaptability, flexibility, or intelligence. Cognitive flexibility is a multidimensional construct. In this paper, we focus specifically on the ability of humans to act effectively and purposefully even when a familiar task environment is changing, thus rendering previously learned skills and strategies less effective or even obsolete. When the environment changes, the execution of previously acquired skills is likely to generate actions that are inappropriate, incorrect, or unhelpful vis-à-vis the agent's goal. A key component of flexible adaptation to changing circumstances is therefore the ability to recover from and unlearn unsuccessful actions in the service of more effective future behavior (Ohlsson, 2010).

This problem differs from the standard view of skill acquisition in two principal ways. First, instead of learning a new skill from scratch, the learning agent in this scenario needs to revise an existing skill or strategy. Second, whereas most work in the computational modeling of skill acquisition has focused on how to make use of positive outcomes, the adaptation scenario requires mechanisms for learning from errors, mistakes, and other types of negative feedback (Ohlsson, 2008).

In past work, we developed a mechanism for learning from negative outcomes called constraint-based specialization (Ohlsson, 1993, 1996, 2007). This mechanism assumes that the agent has access to declarative knowledge in the form of constraints, where a constraint consists of an ordered pair <Cr, Cs> of a relevance criterion Cr and a satisfaction criterion Cs. Unlike propositions, constraints do not encode truths, but norms and prescriptions, e.g., traffic laws. A speed limit does not describe how fast drivers are going, but specifies the range within which their speeds ought to fall. Constraints support evaluation and judgment rather than deduction or explanation. In a constraint-based system, the architecture matches the relevance criteria of all constraints against the current state of its world in each cycle of operation. For constraints with matching relevance criteria, the satisfaction criteria are also matched. Satisfied constraints require no response, but a constraint violation signals a failed expectation (due to a change in the world or to incomplete or erroneous knowledge); this is a learning opportunity. The purpose of the change triggered by a constraint violation is to revise the current skill or strategy in such a way as to avoid violating the same constraint in the future. The computational problem involved in unlearning an error is to specify exactly how to revise the relevant skill when an error is detected. The constraint-based specialization algorithm is a general solution to this problem (Ohlsson & Rees, 1991).
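To make the two-part constraint representation and the per-cycle matching regime concrete, the following is a minimal sketch in Python. The names (State, Constraint, check_constraints) and the encoding of world states as sets of literals are our illustrative assumptions, not the actual representations used in HS or ICARUS.

# A minimal sketch of the <Cr, Cs> constraint representation and the
# per-cycle matching loop described above. All names and the propositional
# state encoding are illustrative assumptions, not taken from HS or ICARUS.
from dataclasses import dataclass
from typing import Callable, FrozenSet, List

State = FrozenSet[str]  # a world state as a set of ground literals

@dataclass(frozen=True)
class Constraint:
    name: str
    relevance: Callable[[State], bool]     # Cr: when does the constraint apply?
    satisfaction: Callable[[State], bool]  # Cs: what ought to hold when it does?

def check_constraints(state: State, constraints: List[Constraint]) -> List[Constraint]:
    """Return the constraints violated in `state`: relevant but unsatisfied.
    Irrelevant constraints are ignored; relevant, satisfied ones need no response."""
    return [c for c in constraints
            if c.relevance(state) and not c.satisfaction(state)]

# Example: a speed limit does not describe the agent's speed; it prescribes
# the range within which the speed ought to fall.
speed_limit = Constraint(
    name="highway-speed-limit",
    relevance=lambda s: "on-highway" in s,
    satisfaction=lambda s: "speed-within-limit" in s,
)

assert check_constraints(frozenset({"on-highway", "speed-within-limit"}),
                         [speed_limit]) == []             # satisfied: no response
assert check_constraints(frozenset({"on-highway"}),
                         [speed_limit]) == [speed_limit]  # violation: learning opportunity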
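The specialization step itself can be sketched in the same style, under the simplifying assumption that rule conditions and constraint criteria are sets of literals: when a rule's action leads to a violation, the rule is replaced by narrower variants that incorporate information from the violated constraint, so the revised skill no longer fires in the offending situations. The Rule and specialize names and the two-variant scheme shown here are illustrative, not the exact algorithm of Ohlsson and Rees (1991).

# A sketch of specializing a faulty rule after a constraint violation, in the
# spirit of constraint-based specialization. The two-variant scheme and all
# names are illustrative assumptions, not the published algorithm.
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Pattern = FrozenSet[str]

@dataclass(frozen=True)
class Rule:
    name: str
    conditions: Pattern  # literals that must hold for the rule to fire
    action: str

def specialize(rule: Rule, relevance: Pattern, satisfaction: Pattern) -> Tuple[Rule, Rule]:
    """Replace a rule whose action violated a constraint with two narrower
    variants: one that fires only when the constraint is irrelevant, and one
    that fires only when it is relevant and already satisfied. Neither
    variant can violate the same constraint again."""
    when_irrelevant = Rule(rule.name + "/irrelevant",
                           rule.conditions | {"not " + lit for lit in relevance},
                           rule.action)
    when_satisfied = Rule(rule.name + "/satisfied",
                          rule.conditions | relevance | satisfaction,
                          rule.action)
    return when_irrelevant, when_satisfied

# Example: pushing violated a "fragile items need a padded surface" constraint.
push = Rule("push-object", frozenset({"holding(obj)"}), "push(obj)")
irr, sat = specialize(push,
                      relevance=frozenset({"fragile(obj)"}),
                      satisfaction=frozenset({"padded(surface)"}))
# irr now fires only for non-fragile objects; sat only onto padded surfaces.

Because both variants strictly tighten the original conditions, the error is unlearned without discarding the skill as a whole.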
The constraint-based specialization mechanism was previously implemented in HS, a production system architecture (Ohlsson, 1996). The HS system was limited along several dimensions. First, HS did not explicitly represent or take into account the hierarchical