Synthesizing Realistic Image-based Avatars by Body Sway Analysis

We propose a method for synthesizing body sway to give human-like movement to image-based avatars. The method is based on an analysis of body sway in real people. Existing methods mainly handle the action states of avatars without sufficiently considering the wait states that occur between them. The wait state is essential for filling the periods before and after interaction, and users require both wait and action states to communicate naturally with avatars in interactive systems. Our method measures the temporal changes in body sway motion of each body part of a standing subject from a single-camera video sequence. We can then synthesize a new video sequence containing body sway of arbitrary length by randomly transitioning between points in the source sequence at which the motion is close to zero. The results of a subjective assessment show that avatars with body sway synthesized by our method appeared more alive to users than those produced by baseline methods.
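The core synthesis step described above can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes a per-frame body-sway motion magnitude has already been measured (e.g., from tracked body parts), and the hypothetical `threshold` parameter stands in for whatever criterion the paper uses to decide that motion is "close to zero":

```python
import random

def find_transition_points(motion, threshold=0.05):
    """Indices of frames whose body-sway motion magnitude is near zero.

    `motion` is a list of per-frame motion magnitudes; the threshold is an
    illustrative assumption, not a value from the paper.
    """
    return [i for i, m in enumerate(motion) if m < threshold]

def synthesize_sequence(num_frames, motion, target_length,
                        threshold=0.05, seed=0):
    """Build a frame-index sequence of arbitrary length.

    Plays the source video forward, and whenever the current frame is a
    low-motion (transition) point, randomly jumps to another such point,
    so the synthesized sway never visibly repeats or snaps.
    """
    rng = random.Random(seed)
    points = set(find_transition_points(motion, threshold))
    out, i = [], 0
    while len(out) < target_length:
        out.append(i)
        if i in points and len(points) > 1:
            # Random transition: jump to a different near-zero-motion frame.
            i = rng.choice(sorted(points - {i}))
        else:
            # Otherwise keep playing the source sequence, looping at the end.
            i = (i + 1) % num_frames
    return out
```

Jumping only at near-zero-motion frames is what makes the cuts hard to perceive: both the outgoing and incoming frames show the body almost at rest, so no motion discontinuity is introduced.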
