Linguistically valid movement behavior measured non-invasively

We use optical flow to extract reliable kinematics from video for motions of the head, face, torso, and hands during speech and musical performance. Unlike dot- and marker-based measures, these markerless measures are non-invasive and require no a priori specification of measurement locations. Reliability is assessed by comparison with marker-tracking data, and the method’s utility is demonstrated for data from Plains Cree, English, and Shona.

Index Terms: optical flow, kinematics, non-invasive measures.

1. Overview

Since the mid-1990s [1], we have been keen to develop video-based tools for measuring spoken communication that would be computationally tractable, reliable, non-invasive, and not restricted to laboratory recording equipment and conditions. At that time, digital image processing was cumbersome and expensive, and it was widely assumed that video images had to be of the highest possible resolution in order to withstand fine-grained analysis (e.g., [2]). The technology has since improved dramatically, and we now know that the visible attributes of spoken communication tend to be ubiquitous, simple (e.g., linear), and accessible to perceivers at surprisingly low temporal and spatial resolutions [3, 4, 5]. Given these technical and conceptual advances, the time is ripe for video-based motion analysis tools that can be applied to inexpensively acquired video data.