Grounding the Lexical Semantics of Verbs in Visual Perception using Force Dynamics and Event Logic

This paper presents an implemented system for recognizing the occurrence of events described by simple spatial-motion verbs in short image sequences. The semantics of these verbs is specified with event-logic expressions that describe changes in the state of force-dynamic relations between the participants of the event. An efficient finite representation is introduced for the infinite sets of intervals that occur when describing liquid and semi-liquid events. Additionally, an efficient procedure using this representation is presented for inferring occurrences of compound events, described with event-logic expressions, from occurrences of primitive events. Using force dynamics and event logic to specify the lexical semantics of events allows the system to be more robust than prior systems based on motion profiles.
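The idea of a finite representation for liquid events can be illustrated with a small sketch. It is not the representation used by the system, which the abstract does not spell out. It assumes that a liquid event is one that holds over every subinterval of any interval over which it holds, and it further assumes closed intervals over integer frame numbers. Under those assumptions, storing only the maximal intervals stands in for all of their subintervals, and the conjunction of two liquid events can be computed by intersecting maximal intervals. The names `maximal_intervals`, `holds_over`, and `conjoin` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a closed interval [start, end] over integer frame numbers.
Interval = Tuple[int, int]


def maximal_intervals(frames: List[bool]) -> List[Interval]:
    """Collapse a per-frame truth sequence into its maximal true intervals."""
    result: List[Interval] = []
    start = None
    for t, holds in enumerate(frames):
        if holds and start is None:
            start = t                      # a run of true frames begins
        elif not holds and start is not None:
            result.append((start, t - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        result.append((start, len(frames) - 1))
    return result


def holds_over(maximal: List[Interval], query: Interval) -> bool:
    """A liquid event holds over `query` if some maximal interval contains it."""
    a, b = query
    return any(s <= a and b <= e for s, e in maximal)


def conjoin(xs: List[Interval], ys: List[Interval]) -> List[Interval]:
    """Maximal intervals over which two liquid events hold simultaneously."""
    out: List[Interval] = []
    for s1, e1 in xs:
        for s2, e2 in ys:
            s, e = max(s1, s2), min(e1, e2)
            if s <= e:                     # keep only non-empty overlaps
                out.append((s, e))
    return sorted(out)


# Hypothetical per-frame outputs of two primitive force-dynamic relations.
supported = maximal_intervals([False, True, True, True, True, False])
contacts = maximal_intervals([True, True, True, False, True, True])
both = conjoin(supported, contacts)
```

Here `both` lists only the maximal intervals of the conjunction, and a query such as `holds_over(both, (1, 2))` is answered by containment rather than by enumerating subintervals.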
