The POETICON Corpus: Capturing Language Use and Sensorimotor Experience in Everyday Interaction

Natural language use, acquisition, and understanding usually take place in multisensory, multimedia communication environments. To model language in its interaction and integration with sensorimotor experience, one therefore needs a representative corpus of this interplay. In this paper, we present the first corpus of language use and sensorimotor experience recordings in everyday human-human interaction, in which spontaneous spoken communication has been recorded along with corresponding multi-view video, 3D full-body kinematics, and 3D tracking of the objects in focus. The twelve-hour corpus comprises six everyday human-human interaction scenes, each performed three times by four different English-speaking couples (a male and a female actor), with each couple acting each scene in two settings: a fully naturalistic setting recorded with a 5-camera multi-view video setup, and a high-tech setting with full-body motion capture of both individuals, 2-camera multi-view video, and 3D tracking of the focus objects. The corpus has been developed within the EU-funded cognitive systems research project POETICON (http://www.poeticon.eu) and represents a new type of language resource for cognitive systems: a corpus that reveals the dynamic role of language in its interplay with sensorimotor experience and allows one to model this interplay computationally.
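To make the corpus design concrete, the following minimal Python sketch enumerates the recording sessions implied by the figures above. It assumes that the three repetitions of each scene are recorded in each setting (the text leaves this split unspecified), and all identifiers (Session, SCENES, SETTINGS, and so on) are illustrative rather than part of any POETICON tooling.

```python
from dataclasses import dataclass
from itertools import product

SCENES = [f"scene_{i}" for i in range(1, 7)]   # six everyday interaction scenes
COUPLES = [f"couple_{c}" for c in "ABCD"]      # four English-speaking male/female pairs
REPETITIONS = (1, 2, 3)                        # each scene acted three times

# Modalities recorded in each of the two settings, as described above.
SETTINGS = {
    "naturalistic": ["speech audio", "5-camera multi-view video"],
    "high-tech": ["speech audio", "2-camera multi-view video",
                  "full-body motion capture (both actors)",
                  "3D tracking of focus objects"],
}

@dataclass(frozen=True)
class Session:
    scene: str
    couple: str
    repetition: int
    setting: str

# Assumption: repetitions are fully crossed with settings; adjust if the
# corpus instead splits the three repetitions across the two settings.
sessions = [Session(sc, cp, rep, st)
            for sc, cp, rep, st in product(SCENES, COUPLES, REPETITIONS, SETTINGS)]

print(len(sessions))                 # 6 * 4 * 3 * 2 = 144 sessions
print(sessions[0], SETTINGS[sessions[0].setting])
```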
