Animating Your Life: Real-Time Video-to-Animation Translation

We demonstrate a video-to-animation translator that transforms real-world video into cartoon or ink-wash animation in real time. When users upload a video or record what they are seeing with a phone, the translator renders the live video stream in a cartoon or ink-wash animation style while preserving the original content. We formulate this task as an unpaired video-to-video translation problem, since manually labeling paired video-animation data is costly and often impractical. Technically, a unified unpaired video-to-video translator is utilized to exploit both appearance structure and temporal continuity in video synthesis. As such, not only the visual appearance of each frame but also the motion between consecutive frames is kept realistic and consistent during translation. Built on these techniques, our demonstration runs on arbitrary videos in the wild and supports live video-to-animation translation, engaging users with animated artistic renditions of their lives.
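
To make the formulation concrete, below is a minimal PyTorch-style sketch (not the authors' implementation) of how an unpaired translator can combine an adversarial loss with a cycle-consistency loss for appearance and a flow-warping loss for temporal continuity. The generators G and F_inv, the discriminator D, the helper names warp and translation_losses, the loss weights lam_cyc and lam_tmp, and the backward-flow convention are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N, C, H, W) with a dense flow field (N, 2, H, W).

    Assumes `flow[p]` points from a pixel p in the current frame to its
    source location in the previous frame (e.g., as estimated by FlowNet2).
    """
    n, _, h, w = frame.shape
    # Build an absolute sampling grid, shift it by the flow vectors,
    # then normalize to [-1, 1] as required by grid_sample.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device, dtype=frame.dtype),
        torch.arange(w, device=frame.device, dtype=frame.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(
        frame, torch.stack((grid_x, grid_y), dim=-1), align_corners=True
    )

def translation_losses(G, F_inv, D, x_prev, x_curr, flow,
                       lam_cyc=10.0, lam_tmp=5.0):
    """Generator-side losses for one pair of consecutive real frames."""
    y_prev, y_curr = G(x_prev), G(x_curr)
    # Adversarial term: animated frames should fool the discriminator.
    logits = D(y_curr)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Cycle-consistency term: translating back should recover the input,
    # which substitutes for paired supervision.
    cyc = F.l1_loss(F_inv(y_curr), x_curr)
    # Temporal term: the current output should match the previous output
    # warped by the motion observed in the input video.
    tmp = F.l1_loss(y_curr, warp(y_prev, flow))
    return adv + lam_cyc * cyc + lam_tmp * tmp
```

The temporal term is what distinguishes video translation from per-frame image translation: without it, independently stylized frames tend to flicker, whereas penalizing deviation from the flow-warped previous output encourages consistent motion across consecutive frames.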
