Deep Generative Modelling of Human Reach-and-Place Action

The motion of picking up and placing an object in 3D space is full of subtle detail. Such motions are typically shaped by the same constraints: swiftness, energy efficiency, and physiological limits. Yet even for identical goals, the realized motion is subject to natural variation. To capture these aspects computationally, we propose a deep generative model for human reach-and-place action, conditioned on a start and an end position. We captured a dataset of 600 such human 3D actions, sampling the 2×3-D space of source and target positions. While temporal variation is often modeled with complex learning machinery such as recurrent neural networks or networks with memory or attention, we demonstrate a much simpler approach that is convolutional in time and uses a (periodic) temporal encoding. Given a latent code and conditioned on start and end positions, the model generates a complete 3D character motion in linear time as a sequence of convolutions. Our evaluation includes several ablations, an analysis of generative diversity, and applications.
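To make the described architecture concrete, the following is a minimal sketch of a temporally convolutional generator of this kind. It is not the authors' implementation: the joint count, sequence length, layer widths, kernel sizes, and the exact form of the periodic encoding are all assumptions made here for illustration; only the overall recipe (latent code plus start/end conditioning, sinusoidal time features, 1-D convolutions over the time axis) follows the abstract.

```python
import torch
import torch.nn as nn

def periodic_encoding(seq_len: int, n_freqs: int = 4) -> torch.Tensor:
    """Sin/cos features of normalized time, shape (2*n_freqs, seq_len)."""
    t = torch.linspace(0.0, 1.0, seq_len)               # normalized time
    freqs = (2.0 ** torch.arange(n_freqs)) * torch.pi   # assumed frequency set
    angles = freqs[:, None] * t[None, :]                # (n_freqs, seq_len)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=0)

class ReachPlaceGenerator(nn.Module):
    """Hypothetical sketch: maps (start, end, z) to a joint-trajectory tensor
    using only 1-D convolutions over time (no recurrence or attention)."""

    def __init__(self, n_joints=22, latent_dim=32, hidden=128,
                 seq_len=60, n_freqs=4):
        super().__init__()
        self.seq_len = seq_len
        cond_dim = 3 + 3 + latent_dim        # start xyz + end xyz + latent code
        in_ch = cond_dim + 2 * n_freqs       # condition + temporal encoding
        self.register_buffer("t_enc", periodic_encoding(seq_len, n_freqs))
        self.net = nn.Sequential(            # convolutional in time only
            nn.Conv1d(in_ch, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, 3 * n_joints, kernel_size=5, padding=2),
        )

    def forward(self, start, end, z):
        # start, end: (B, 3); z: (B, latent_dim)
        cond = torch.cat([start, end, z], dim=-1)              # (B, cond_dim)
        cond = cond[:, :, None].expand(-1, -1, self.seq_len)   # broadcast over time
        t = self.t_enc[None].expand(cond.shape[0], -1, -1)     # (B, 2*n_freqs, T)
        x = torch.cat([cond, t], dim=1)                        # (B, in_ch, T)
        return self.net(x)                                     # (B, 3*n_joints, T)

# Usage: each latent sample yields one motion variant for the same endpoints.
gen = ReachPlaceGenerator()
motion = gen(torch.randn(1, 3), torch.randn(1, 3), torch.randn(1, 32))
```

Because every output frame is produced by the same convolution stack applied across the temporal encoding, sampling a new latent code regenerates the entire trajectory in a single parallel pass of cost linear in sequence length, in contrast to the step-by-step rollout of a recurrent model.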
