MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics
Ersin Yumer | Ruben Villegas | Honglak Lee | Kalyan Sunkavalli | Sunil Hadap | Eli Shechtman | Xinchen Yan | Akash Rastogi
[1] Martial Hebert,et al. An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders , 2016, ECCV.
[2] Justus Thies,et al. Face2Face: real-time face capture and reenactment of RGB videos , 2019, Commun. ACM.
[3] Sergey Levine,et al. Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.
[4] Tal Hassner,et al. Regressing Robust and Discriminative 3D Morphable Models with a Very Deep Neural Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Jitendra Malik,et al. Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[6] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[7] Yi Zhou,et al. Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis , 2017, ICLR.
[8] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[9] Seunghoon Hong,et al. Decomposing Motion and Content for Natural Video Sequence Prediction , 2017, ICLR.
[10] Michael F. Cohen,et al. Efficient generation of motion transitions using spacetime constraints , 1996, SIGGRAPH.
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Samy Bengio,et al. Generating Sentences from a Continuous Space , 2015, CoNLL.
[13] Derek Bradley,et al. High-quality passive facial performance capture using anchor frames , 2011, ACM Trans. Graph.
[14] Thomas Vetter,et al. A morphable model for the synthesis of 3D faces , 1999, SIGGRAPH.
[15] Martial Hebert,et al. Dense Optical Flow Prediction from a Static Image , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[16] Rob Fergus,et al. Stochastic Video Generation with a Learned Prior , 2018, ICML.
[17] Sami Romdhani,et al. A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.
[18] Fei Yang,et al. Facial expression editing in video using a temporally-smooth factorization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[19] Martial Hebert,et al. The Pose Knows: Video Forecasting by Generating Pose Futures , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[20] Antonio Torralba,et al. Generating Videos with Scene Dynamics , 2016, NIPS.
[21] Yuting Zhang,et al. Deep Visual Analogy-Making , 2015, NIPS.
[22] Eric P. Xing,et al. Controllable Text Generation , 2017, ArXiv.
[23] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[24] Honglak Lee,et al. Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.
[25] Ivan Laptev,et al. On Space-Time Interest Points , 2005, International Journal of Computer Vision.
[26] Sergey Levine,et al. Time-Contrastive Networks: Self-Supervised Learning from Video , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[27] Alexei A. Efros,et al. Toward Multimodal Image-to-Image Translation , 2017, NIPS.
[28] Fei Yang,et al. Expression flow for 3D-aware face component transfer , 2011, SIGGRAPH 2011.
[29] Ruben Villegas,et al. Hierarchical Long-term Video Prediction without Supervision , 2018, ICML.
[30] Daniel Cohen-Or,et al. Bringing portraits to life , 2017, ACM Trans. Graph.
[31] Douglas Eck,et al. A Neural Representation of Sketch Drawings , 2017, ICLR.
[32] Tamara L. Berg,et al. Learning Temporal Transformations from Time-Lapse Videos , 2016, ECCV.
[33] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[34] Guoying Zhao,et al. Aff-Wild: Valence and Arousal ‘In-the-Wild’ Challenge , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[35] Jan Kautz,et al. MoCoGAN: Decomposing Motion and Content for Video Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Cristian Sminchisescu,et al. Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Thomas Brox,et al. FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[38] H. Seidel,et al. Pattern-aware Deformation Using Sliding Dockers , 2011, SIGGRAPH 2011.
[39] Sergey Levine,et al. Stochastic Variational Video Prediction , 2017, ICLR.
[40] Ying Wu,et al. Mining actionlet ensemble for action recognition with depth cameras , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[41] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[42] Jitendra Malik,et al. Recurrent Network Models for Human Dynamics , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[43] Ira Kemelmacher-Shlizerman,et al. What Makes Tom Hanks Look Like Tom Hanks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[44] Ruben Villegas,et al. Learning to Generate Long-term Future via Hierarchical Prediction , 2017, ICML.
[45] Vighnesh Birodkar,et al. Unsupervised Learning of Disentangled Representations from Video , 2017, NIPS.
[46] Jiajun Wu,et al. Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks , 2016, NIPS.
[47] Ira Kemelmacher-Shlizerman,et al. Synthesizing Obama , 2017, ACM Trans. Graph.
[48] Hans-Peter Seidel,et al. Performance capture from sparse multi-view video , 2008, ACM Trans. Graph.
[49] Scott E. Reed,et al. Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis , 2015, NIPS.
[50] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[51] Eric P. Xing,et al. Toward Controlled Generation of Text , 2017, ICML.
[52] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[53] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[54] Silvio Savarese,et al. A Hierarchical Representation for Future Action Prediction , 2014, ECCV.
[55] Ali Farhadi,et al. Actions ~ Transformations , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Ronen Basri,et al. Actions as Space-Time Shapes , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[57] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[58] Kevin A. Smith,et al. Sources of uncertainty in intuitive physics , 2012, CogSci.
[59] Scott Cohen,et al. Forecasting Human Dynamics from Static Images , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Geoffrey E. Hinton,et al. Transforming Auto-Encoders , 2011, ICANN.
[61] Sergey Levine,et al. Time-Contrastive Networks: Self-Supervised Learning from Multi-view Observation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[62] Alex Graves,et al. Video Pixel Networks , 2016, ICML.
[63] Christoph Bregler,et al. Learning and recognizing human dynamics in video sequences , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[64] Joshua B. Tenenbaum,et al. Deep Convolutional Inverse Graphics Network , 2015, NIPS.
[65] Xiaoming Liu,et al. Face Alignment in Full Pose Range: A 3D Total Solution , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[66] Xiangyu Zhu,et al. Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[67] Honglak Lee,et al. Attribute2Image: Conditional Image Generation from Visual Attributes , 2015, ECCV.