Synthesis and acquisition of Laban Movement Analysis qualitative parameters for communicative gestures

Humans use gestures in most communicative acts. How are these gestures initiated and performed? What communicative roles do they play, and what kinds of meaning do they convey? How do listeners extract and understand these meanings? Can we build computerized communicating agents that extract and understand these meanings and, in turn, simulate and display expressive gestures well enough to be effective conversational partners? These questions are easy to ask but far more difficult to answer. In this thesis we address them through the synthesis and acquisition of communicative gestures. Our approach to gesture is grounded in the principles of movement observation science, specifically Laban Movement Analysis (LMA) and its Effort and Shape components. LMA, developed in the dance community over the past seventy years, is an effective method for observing, describing, notating, and interpreting human movement to enhance communication and expression in everyday and professional life. Its Effort and Shape components provide a comprehensive and valuable set of parameters for characterizing gesture formation. The computational model we have built (the EMOTE system) offers the power and flexibility to procedurally synthesize gestures from predefined key pose and timing information plus Effort and Shape qualities. To give a complete communicative gesture model a quantitative foundation, we have also built a computational framework in which the observable characteristics of gestures, not only key poses and timing but also the underlying motion qualities, can be extracted from live performance, either from 3D motion capture data or from 2D video, and correlated with observations validated by LMA notators. Experiments of this sort have not been conducted before; they should interest both the computer animation and computer vision communities, and they provide a powerful methodological tool for creating personalized, communicating agents.
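To make the synthesis side concrete, the sketch below illustrates one way quality parameters of this kind could modulate keyframe interpolation. It is an illustration under stated assumptions, not the EMOTE implementation: the function names and the mapping from an Effort-style "Time" quality to a timing exponent are hypothetical.

```python
# Illustrative sketch only, NOT the EMOTE implementation: the function names
# and the mapping from an Effort-style "Time" quality to a timing exponent
# are assumptions made for demonstration.

import numpy as np


def warp_clock(u: float, time_effort: float) -> float:
    """Warp normalized time u in [0, 1].

    time_effort in [-1, 1]: -1 ~ sustained (change spread evenly),
    +1 ~ sudden (most of the change packed late in the interval).
    Hypothetical mapping: the timing exponent doubles or halves with the quality.
    """
    return u ** (2.0 ** time_effort)


def interpolate_gesture(key_poses: np.ndarray,
                        key_times: np.ndarray,
                        time_effort: float,
                        frame_rate: float = 30.0) -> np.ndarray:
    """Interpolate (K, J) joint-angle key poses given at K key times (seconds),
    using an Effort-warped clock; returns an (N, J) array of sampled poses."""
    samples = []
    for t in np.arange(key_times[0], key_times[-1], 1.0 / frame_rate):
        k = min(np.searchsorted(key_times, t, side="right") - 1,
                len(key_times) - 2)
        u = (t - key_times[k]) / (key_times[k + 1] - key_times[k])
        w = warp_clock(u, time_effort)
        samples.append((1.0 - w) * key_poses[k] + w * key_poses[k + 1])
    return np.stack(samples)


# Example: two key poses for a three-joint arm, held over one second.
poses = np.array([[0.0, 0.0, 0.0], [1.2, 0.6, -0.4]])
times = np.array([0.0, 1.0])
sudden = interpolate_gesture(poses, times, time_effort=1.0)
sustained = interpolate_gesture(poses, times, time_effort=-1.0)
```

In this toy mapping, a larger Time value compresses most of the pose change toward the end of each key interval, which is one simple reading of "sudden" versus "sustained"; any real system would use a mapping grounded in the LMA literature and validated against notator judgments.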

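On the acquisition side, the following sketch shows the kind of low-level kinematic features one might compute from a captured 3D end-effector trajectory before correlating them with notator-validated Effort observations. The feature choices are plausible correlates, not the thesis's actual feature set.

```python
# Illustrative sketch only: these kinematic features are plausible low-level
# correlates of Effort qualities, not the feature set used in the thesis.

import numpy as np


def motion_quality_features(positions: np.ndarray, dt: float) -> dict:
    """Summarize an (N, 3) wrist/end-effector trajectory sampled every dt seconds."""
    vel = np.gradient(positions, dt, axis=0)    # finite-difference velocity
    acc = np.gradient(vel, dt, axis=0)          # acceleration
    jerk = np.gradient(acc, dt, axis=0)         # jerk (rate of change of acceleration)

    speed = np.linalg.norm(vel, axis=1)
    return {
        "mean_speed": float(speed.mean()),                        # Time-like proxy
        "peak_accel": float(np.linalg.norm(acc, axis=1).max()),   # Weight-like proxy
        "mean_jerk": float(np.linalg.norm(jerk, axis=1).mean()),  # Flow-like proxy
    }
```

Features like these, computed per gesture phrase from motion capture or tracked video, could then be regressed against or classified by validated Effort and Shape annotations.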