Paper information - MacVisSTA: A System for Multimodal Analysis of Human Communication and Interaction

The study of embodied communication requires access to multiple data sources, such as multistream video and audio, along with various derived data and metadata, such as gesture, head, posture, facial expression, and gaze information. This thesis presents the data collection, annotation, and analysis for multiple participants engaged in planning meetings. In support of these analysis tasks, it presents the multimedia Visualization for Situated Temporal Analysis for Macintosh (MacVisSTA) system, which supports the analysis of multimodal human communication through video, audio, speech transcriptions, and gesture and head orientation data. The system uses a multiple linked representation strategy in which the different representations are linked by the current time focus. MacVisSTA supports analysis of the synchronized data at varying timescales for coarse-to-fine observational studies, and its hybrid architecture may be extended through plugins. Finally, this effort has produced an encoding of behavioral and language data that enables collaborative research, realized with the aid of, and through an interface to, a database management system.

Richard Travis Rose
