Endowing a Cognitive Architecture with World Knowledge

Dario D. Salvucci (salvucci@drexel.edu)
College of Computing & Informatics, Drexel University
3141 Chestnut St., Philadelphia, PA 19104 USA

Abstract

Although computational models developed in cognitive architectures are often rich in their knowledge of procedural skills, they are often poor in their knowledge of declarative facts about the world. This work endows the ACT-R cognitive architecture with world knowledge derived from Wikipedia, compiling a knowledge base of over 37 million declarative facts that can be accessed by a cognitive model via standard memory retrievals. Estimates of the accessibility of these facts are also derived from Wikipedia text, allowing ACT-R to utilize the likelihood of knowing a fact and associations between related facts. Integration with a simple procedural model demonstrates how the knowledge base may serve not only to answer simple factual questions, but also to disambiguate among multiple possible meanings based on context. The resulting knowledge base can be queried quickly (typically well under one second) and is easily generalizable to other cognitive architectures.

Keywords: Cognitive architectures; Wikipedia; ACT-R

Introduction

Cognitive architectures, particularly production-system architectures (e.g., Anderson, 2007; Laird, Newell, & Rosenbloom, 1987; Meyer & Kieras, 1997; Newell, 1990), have been used for a number of years as a computational framework for representing human cognition and behavior. Researchers have employed such architectures to model behavior in a large array of task domains. The vast majority of these models were developed with an emphasis on the procedural skills necessary to perform particular tasks; for instance, models have been developed to simulate behavior in the domains of piloting (Jones et al., 1999), game playing (Laird, 2002; Taatgen et al., 2003), and driving (Salvucci, 2006). At the same time, these models often have minimal declarative, factual knowledge; while they may include tens of facts to represent, say, the addition tables up to 9+9, they typically have little to no general knowledge about the world: for instance, what is the capital of Pennsylvania, or who invented the light bulb, or what sport is played by the Pittsburgh Steelers.

This project aims to develop a large-scale knowledge base that can easily be integrated into cognitive architectures to provide models with general world knowledge. Although past efforts have created large-scale knowledge databases (e.g., Cyc: Lenat, 1994; Scone: Fahlman, 2006; WordNet: Miller, 1995), these databases do not necessarily integrate easily with a cognitive architecture: they cannot be accessed in a straightforward way from a production system, nor do they include the cognitively plausible properties, such as the accessibility of knowledge elements, that some architectures rely on for modeling cognition (see Ball, Rodgers, & Gluck, 2004, for further discussion). More recent efforts to create knowledge bases for cognitive architectures (e.g., Douglass & Myers, 2010; Derbinsky, Laird, & Smith, 2010; Emond, 2006) have explored the practical challenges inherent in such work, especially in understanding and reducing the computational demands of retrieving information from a large-scale database.

This project uses the Wikipedia knowledge base to derive a declarative database for the ACT-R cognitive architecture (Anderson, 2007), complete both with tens of millions of world-knowledge facts and with estimates of the accessibility (activation) of these facts. In doing so, the project addresses theoretical challenges (e.g., an appropriate representation of these facts) and practical challenges (e.g., computational efficiency) in a way that generalizes to other cognitive architectures beyond ACT-R.

Declarative Knowledge Base

Wikipedia [http://www.wikipedia.org] is the largest open body of general knowledge on the Internet today, with over 4 million articles in English alone, written by thousands of active contributors. Both its breadth of topics and its open licensing make Wikipedia extremely amenable to use as a knowledge base for cognitive modeling. Unfortunately, the primary content of Wikipedia comes in the body of its full-text articles, and until cognitive architectures have large-scale robust natural-language capabilities, they cannot make direct use of such articles. Fortunately, other aspects of the Wikipedia knowledge base are available in representations that more easily interface with modern architectures.

Knowledge Content

The primary content for this work comes from the DBpedia [http://www.dbpedia.org] project, which extracts and disseminates structured representations of Wikipedia knowledge. Specifically, DBpedia makes available several large datasets that proved useful in building a knowledge base for cognitive architectures. The datasets, and the resulting knowledge arising from them, are described here.

Relations. The first dataset includes information from Wikipedia “infoboxes” that appear alongside the full-text articles and provide knowledge in terms of relations. Table 1 shows the (partial) infobox for “Harrison Ford” as it

[1] Richard L. Lewis, et al. An Activation-Based Model of Sentence Processing as Skilled Memory Retrieval, 2005, Cogn. Sci.

[2] John E. Laird, et al. Performance evaluation of declarative memory systems in Soar, 2011.

[3] D. E. Kieras, et al. A computational theory of executive cognitive processes and multiple-task performance: Part 1. Basic mechanisms, 1997, Psychological Review.

[4] Nicholas L. Cassimatis, et al. A Cognitive Substrate for Achieving Human-Level Intelligence, 2006, AI Mag.

[5] Dario D. Salvucci. Integration and Reuse in Cognitive Skill Acquisition, 2013, Cogn. Sci.

[6] George A. Miller, et al. WordNet: A Lexical Database for English, 1995, HLT.

[7] S. Douglass, et al. Concurrent Knowledge Activation Calculation in Large Declarative Memories, 2010.

[8] John E. Laird, et al. Towards efficiently supporting large symbolic declarative memories, 2010.

[9] A. Newell. You can't play 20 questions with nature and win: projective comments on the papers of this symposium, 1973.

[10] Randolph M. Jones, et al. Automated Intelligent Pilots for Combat Flight Simulation, 1998, AI Mag.

[11] John E. Laird, et al. Research in human-level AI using computer games, 2002, CACM.

[12] N. Taatgen, et al. How to construct a believable opponent using cognitive modeling in the game of set, 2003.

[13] Stuart M. Rodgers, et al. Integrating ACT-R and Cyc in a large-scale model of language comprehension for use in intelligent agents, 2004.

[14] John R. Anderson, et al. How Can the Human Mind Occur, 2007.

[15] Paul Bello, et al. Developmental Accounts of Theory-of-Mind Acquisition: Achieving Clarity via Computational Cognitive Modeling, 2006.

[16] Douglas B. Lenat, et al. CYC: a large-scale investment in knowledge infrastructure, 1995, CACM.

[17] Richard Reviewer-Granger. Unified Theories of Cognition, 1991, Journal of Cognitive Neuroscience.

[18] John R. Anderson. How Can the Human Mind Occur in the Physical Universe, 2007.

[19] Dario D. Salvucci. Modeling Driver Behavior in a Cognitive Architecture, 2006, Hum. Factors.

[20] John R. Anderson, et al. A Theory of Sentence Memory as Part of A General Theory of Memory, 2001.

[21] Scott E. Fahlman, et al. Marker-Passing Inference in the Scone Knowledge-Base System, 2006, KSEM.

[22] Allen Newell, et al. SOAR: An Architecture for General Intelligence, 1987, Artif. Intell.

[23] Scott Douglass, et al. Large Declarative Memories in ACT-R, 2009.

[24] N. Taatgen. The nature and transfer of cognitive skills, 2013, Psychological Review.