Coping with Static and Dynamic Spatial Relations

The interplay between visual perception and natural language in human-machine interaction is receiving growing attention, since it is a prominent issue in many potential application areas. The aim of language-oriented AI research in this context is to achieve an operational form of referential semantics that reaches down to the sensory level. In this contribution, we focus on the evaluation of spatial relations, which form the basis for spatial reference expressions in natural language. We consider the algorithmic characterization of static as well as dynamic spatial relations between objects in a time-varying scene. The paper presents new results from the project VITRA (Visual Translator), in which natural language access to visual data has been investigated in different domains of application.

In: Proc. of the 5th Int. Workshop Time, Space and Movement, TSM'95
