The SignCom system for data-driven animation of interactive virtual signers

In this article we present a multichannel animation system for producing utterances signed in French Sign Language (LSF) by a virtual character. Such a system poses two main challenges: simultaneously capturing motion data for the entire body, including the movements of the torso, hands, and face, and developing a data-driven animation engine that accounts for the expressive characteristics of signed languages. Our approach decomposes motion into separate channels representing the body parts that correspond to the linguistic components of signed languages. We show that this animation system can create novel utterances in LSF, and we present an evaluation by target users that highlights the importance of the respective body parts in the production of signs. We validate our framework by testing the believability and intelligibility of our virtual signer.
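
To make the channel decomposition concrete, the sketch below illustrates the core idea in Python. It is a minimal illustration under stated assumptions, not the SignCom implementation: the channel names, the MotionClip class, and the compose_pose function are hypothetical stand-ins for whatever representation the actual engine uses. The point it demonstrates is that when each channel drives a disjoint set of joints, per-channel motion segments can be sampled independently and merged into a single full-body pose.

```python
"""Minimal sketch of multichannel motion composition.

NOT the SignCom implementation: all names here (CHANNELS, MotionClip,
compose_pose) are hypothetical illustrations of splitting a skeleton
into linguistically motivated channels and reassembling per-channel
motion-capture segments into one animation.
"""

from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical channel split, one per linguistic component of the sign.
CHANNELS = ("torso", "right_arm", "left_arm", "hands", "face", "gaze")

Rotation = Tuple[float, float, float]  # e.g. Euler angles per joint


@dataclass
class MotionClip:
    """A captured motion segment restricted to one channel's joints."""
    channel: str
    frames: List[Dict[str, Rotation]]  # per-frame {joint_name: rotation}

    def pose_at(self, t: float) -> Dict[str, Rotation]:
        """Sample the clip at normalized time t in [0, 1] (nearest frame)."""
        index = min(int(t * (len(self.frames) - 1)), len(self.frames) - 1)
        return self.frames[max(index, 0)]


def compose_pose(clips: Dict[str, MotionClip], t: float) -> Dict[str, Rotation]:
    """Merge per-channel samples into a full-body pose.

    Because each channel drives a disjoint set of joints, composition
    reduces to a dictionary union; a real engine would also blend
    transitions between retrieved clips and resolve spatial constraints
    (e.g. hand placement relative to the body).
    """
    pose: Dict[str, Rotation] = {}
    for channel in CHANNELS:
        if channel in clips:
            pose.update(clips[channel].pose_at(t))
    return pose


# Tiny usage example with single-frame clips on two channels.
clips = {
    "right_arm": MotionClip("right_arm", [{"r_shoulder": (0.0, 0.5, 0.0)}]),
    "face": MotionClip("face", [{"brow_left": (0.1, 0.0, 0.0)}]),
}
print(compose_pose(clips, t=0.5))
```

The dictionary-union composition relies on the channels partitioning the skeleton's joints; the design benefit is that segments for different channels can be retrieved and recombined independently, which is what allows a data-driven system to assemble novel utterances from captured material.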
