On the Annotation of Multimodal Behavior and Computation of Cooperation Between Modalities

With the success of multimedia and mobile devices, human-computer interfaces combining several communication modalities, such as speech and gesture, may lead to more "natural" human-computer interaction. Yet, developing multimodal interfaces requires an understanding (and thus the observation and analysis) of human multimodal behavior. In the field of multimodal corpus annotation, there is no standardized coding scheme. In this paper, we describe a coding scheme we have developed. We give examples of how we applied it to a multimodal corpus by producing descriptions. We also provide details about the software we have developed for parsing such descriptions and for computing metrics that measure the cooperation between modalities. Although this paper is concerned with the input side (human towards machine) and thus deals with the annotation of human behavior observed in multimodal corpora, we also provide some ideas on how it might be used for specifying cooperation between output modalities in multimodal agents.