GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images

Graphics Interchange Format (GIF) is a highly portable graphics format that is ubiquitous on the Internet. Despite their small file sizes, GIF images often contain undesirable visual artifacts such as flat color regions, false contours, color shift, and dotted patterns. In this paper, we propose GIF2Video, the first learning-based method for enhancing the visual quality of GIFs in the wild. We focus on the challenging task of GIF restoration by recovering information lost in the three steps of GIF creation: frame sampling, color quantization, and color dithering. We first propose a novel CNN architecture for color dequantization. It is built upon a compositional architecture for multi-step color correction, with a comprehensive loss function designed to handle large quantization errors. We then adapt the Super SloMo network for temporal interpolation of GIF frames. We introduce two large datasets, namely GIF-Faces and GIF-Moments, for both training and evaluation. Experimental results show that our method can significantly improve the visual quality of GIFs, and that it outperforms direct baselines and state-of-the-art approaches.
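
To make the three lossy steps of GIF creation concrete, the following is a minimal sketch of the forward degradation (video to GIF) that restoration has to invert. It is an illustration under stated assumptions, not the pipeline used in the paper: the uniform palette, the Floyd-Steinberg error diffusion weights, and all function names (`sample_frames`, `uniform_palette`, `quantize`, `dither`, `video_to_gif`) are hypothetical choices made here for clarity. Real GIF encoders typically build an adaptive palette of up to 256 colors per frame or per clip.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]
Frame = List[List[Color]]  # frame[y][x] = (r, g, b), channel values in 0..255


def sample_frames(video: List[Frame], step: int) -> List[Frame]:
    """Step 1, frame sampling: keep every `step`-th frame, drop the rest."""
    return video[::step]


def uniform_palette(levels: int) -> List[Color]:
    """Assumed stand-in palette: `levels` evenly spaced values per channel (levels >= 2)."""
    values = [round(i * 255 / (levels - 1)) for i in range(levels)]
    return [(r, g, b) for r in values for g in values for b in values]


def nearest_color(pixel: Sequence[float], palette: List[Color]) -> Color:
    """Return the palette entry with the smallest squared RGB distance to `pixel`."""
    return min(palette, key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, c)))


def quantize(frame: Frame, palette: List[Color]) -> Frame:
    """Step 2, color quantization: snap each pixel to its nearest palette color."""
    return [[nearest_color(px, palette) for px in row] for row in frame]


def dither(frame: Frame, palette: List[Color]) -> Frame:
    """Step 3, color dithering: quantize while diffusing the error to
    unvisited neighbors (Floyd-Steinberg weights, assumed here)."""
    h, w = len(frame), len(frame[0])
    work = [[[float(c) for c in px] for px in row] for row in frame]
    out: Frame = [[(0, 0, 0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = nearest_color(old, palette)
            out[y][x] = new
            err = [o - n for o, n in zip(old, new)]
            for dx, dy, weight in ((1, 0, 7 / 16), (-1, 1, 3 / 16),
                                   (0, 1, 5 / 16), (1, 1, 1 / 16)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h:
                    for c in range(3):
                        work[ny][nx][c] += err[c] * weight
    return out


def video_to_gif(video: List[Frame], step: int, levels: int,
                 use_dither: bool = True) -> List[Frame]:
    """Apply the three lossy steps in order and return the degraded frames."""
    palette = uniform_palette(levels)
    reduce_colors = dither if use_dither else quantize
    return [reduce_colors(frame, palette) for frame in sample_frames(video, step)]
```

In this sketch, plain quantization without dithering tends to produce the flat color regions and false contours mentioned above, while error diffusion trades them for dotted patterns. Color dequantization would aim to undo `quantize` and `dither`, and temporal interpolation would aim to recover the frames dropped by `sample_frames`.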
