Visual and linguistic information in gesture classification

Classification of natural hand gestures is usually approached by applying pattern recognition to the movements of the hand. However, the gesture categories most frequently cited in the psychology literature are fundamentally multimodal: their definitions refer to the surrounding linguistic context. We address the question of whether gestures are naturally multimodal, or whether they can be classified from hand-movement data alone. First, we describe an empirical study showing that removing auditory information significantly impairs human raters' ability to classify gestures. We then present an automatic gesture classification system based solely on an n-gram model of linguistic context; the system is intended to supplement a visual classifier, yet on its own it achieves 66% accuracy on a three-class classification problem. This is higher than the accuracy human raters achieve when presented with the same information.
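The abstract gives no implementation details, but as a rough sketch of what a purely linguistic n-gram classifier of this kind might look like, the example below trains a naive Bayes model over word bigrams drawn from the speech around each gesture. The class names, toy transcripts, bigram features, and add-one smoothing are all illustrative assumptions, not the authors' actual design.

```python
# Hypothetical sketch (not the authors' system): a word-bigram naive Bayes
# classifier over the speech transcribed around each gesture. Class labels,
# example transcripts, and smoothing choices are invented for illustration.
import math
from collections import Counter, defaultdict

def bigrams(words):
    """Return the list of word bigrams for a token sequence."""
    return list(zip(words, words[1:]))

class NgramGestureClassifier:
    def __init__(self):
        self.class_counts = Counter()               # gestures seen per class
        self.feature_counts = defaultdict(Counter)  # class -> bigram -> count
        self.vocab = set()                          # all bigrams seen in training

    def train(self, examples):
        """examples: iterable of (transcript_tokens, gesture_class) pairs."""
        for tokens, label in examples:
            self.class_counts[label] += 1
            for bg in bigrams(tokens):
                self.feature_counts[label][bg] += 1
                self.vocab.add(bg)

    def classify(self, tokens):
        """Return the most probable class under add-one-smoothed naive Bayes."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            score = math.log(count / total)                     # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for bg in bigrams(tokens):
                num = self.feature_counts[label][bg] + 1        # add-one smoothing
                score += math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy usage with invented transcripts and illustrative gesture classes
clf = NgramGestureClassifier()
clf.train([
    ("this one right here".split(), "deictic"),
    ("it spins around like this".split(), "iconic"),
    ("and then we you know move on".split(), "beat"),
])
print(clf.classify("right over here".split()))
```

The same structure extends to unigrams or trigrams by swapping the feature extractor; the point is only that every feature comes from the linguistic channel, with no hand-movement input at all.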
