Paper Information - A Recurrent Variational Autoencoder for Human Motion Synthesis

We propose a novel generative model of human motion that can be trained using a large motion capture dataset, and allows users to produce animations from high-level control signals. As previous architectures struggle to predict motions far into the future due to the inherent ambiguity, we argue that a user-provided control signal is desirable for animators and greatly reduces the predictive error for long sequences. Thus, we formulate a framework which explicitly introduces an encoding of control signals into a variational inference framework trained to learn the manifold of human motion. As part of this framework, we formulate a prior on the latent space, which allows us to generate high-quality motion without providing frames from an existing sequence. We further model the sequential nature of the task by combining samples from a variational approximation to the intractable posterior with the control signal through a recurrent neural network (RNN) that synthesizes the motion. We show that our system can predict the movements of the human body over long horizons more accurately than state-of-the-art methods. Finally, the design of our system considers practical use cases and thus provides a competitive approach to motion synthesis.
