The affordance-based concept

Natural language use relies on situational context. The meaning of words and utterances depends on the physical environment and on the goals and plans of communication partners. These facts should be central to theories of language and to automatic language understanding systems. Instead, they are often ignored, leading to partial theories and to systems that cannot fully interpret linguistic meaning. I introduce a new computational theory of conceptual structure whose core claim is that concepts are neither internal nor external to the language user, but instead span the objective-subjective boundary. The theory takes interaction and prediction as its central theme, rather than solely emphasizing deducing, sensing, or acting. To capture the possible interactions between subject and object, the theory relies on the notion of perceived affordances: structured units of interaction that can be used for prediction at certain levels of abstraction. By using perceived affordances as a basis for language understanding, the theory accounts for many aspects of the situated nature of human language use. It also provides a unified solution to a number of other demands on a theory of language understanding, including conceptual combination, prototypicality effects, and the generative nature of lexical items. To support the theory, I describe an implementation that relies on probabilistic hierarchical plan recognition to predict possible interactions. The elements of a recognized plan provide an instance of perceived affordances, which a linguistic parser uses to ground the meaning of words and grammatical constituents. Evaluations performed in a multiuser role-playing game environment show that this implementation captures the meaning of free-form, spontaneous directive speech acts that cannot be understood without taking into account the intentional and physical situation of speaker and listener. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
