Iconic: speech and depictive gestures at the human-machine interface

People often communicate with a complex mixture of speech and gestures. Gestures serve many different functions in human communication, some of which have been exploited at the computer interface. A largely ignored function of gestures for communicating with computers is the class of depictive gestures. These gestures are closely associated with the content of speech and complement the user’s verbal descriptions. In this class of gestures, the hands describe the shape, spatial relations and movements of objects. We have developed Iconic, a prototype interface that allows users to describe the layout of three-dimensional scenes through a free mixture of speech and depictive gestures. Interpretation of this type of gesture requires an integrated approach in which a high-level interpreter can simultaneously draw on clues from both the speech and gesture channels. In our system, a user’s gestures are not interpreted based on their similarity to some standard form, but are instead processed only to an intermediate feature-based representation. With this approach, gestures can be successfully interpreted in the wider context of information from speech and the graphical domain.
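The interpretation strategy described above — reducing a gesture to form-independent features and letting a high-level interpreter combine them with speech — can be sketched minimally. This is an illustrative toy, not the actual Iconic implementation: the `GestureFeatures` fields, the tiny vocabulary, and the `interpret` function are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class GestureFeatures:
    """Intermediate, feature-based description of a hand movement
    (hypothetical fields; not the paper's actual representation)."""
    handshape: str                       # e.g. "flat", "fist"
    motion: str                          # e.g. "straight", "circular"
    direction: tuple[float, float, float]  # movement direction
    extent: float                        # size of the movement

def interpret(utterance: str, gesture: GestureFeatures) -> dict:
    """Combine clues from both channels: speech supplies the action and
    object; the gesture features supply the spatial parameters that the
    words leave underspecified (which direction, how far)."""
    command = {"action": None, "object": None, "offset": None}
    words = utterance.lower().split()
    if "put" in words or "move" in words:
        command["action"] = "place"
    for obj in ("lamp", "table", "chair"):   # toy scene vocabulary
        if obj in words:
            command["object"] = obj
            break
    # The gesture, never matched against a standard form, fills in
    # the direction and distance of the placement.
    dx, dy, dz = gesture.direction
    command["offset"] = (dx * gesture.extent,
                         dy * gesture.extent,
                         dz * gesture.extent)
    return command

cmd = interpret(
    "put the lamp over here",
    GestureFeatures(handshape="flat", motion="straight",
                    direction=(1.0, 0.0, 0.0), extent=0.5),
)
print(cmd)
```

Note how neither channel alone yields a complete command: the utterance "put the lamp over here" names the object but not the location, while the gesture features carry the direction and extent without naming anything.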