An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time Video Enhancement

Video enhancement is more challenging than still-image enhancement, mainly because of its high computational cost, larger data volumes, and the difficulty of achieving consistency in the spatio-temporal domain. In practice, these challenges are often compounded by a lack of paired examples, which inhibits the application of supervised learning strategies. To address these challenges, we propose an efficient adversarial video enhancement framework that learns directly from unpaired video examples. In particular, our framework introduces new recurrent cells that consist of interleaved local and global modules for implicit integration of spatial and temporal information. The proposed design allows our recurrent cells to efficiently propagate spatio-temporal information across frames and reduces the need for high-complexity networks. Our setting enables learning from unpaired videos in a cyclic adversarial manner, where the proposed recurrent units are employed in all architectures. Efficient training is accomplished by introducing a single discriminator that learns the joint distribution of the source and target domains simultaneously. The enhancement results demonstrate clear superiority of the proposed video enhancer over state-of-the-art methods in terms of visual quality, quantitative metrics, and inference speed. Notably, our video enhancer is capable of enhancing over 35 frames per second of FullHD video (1080x1920).
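
To put the reported speed in perspective, the following minimal sketch converts a frame rate and resolution into a per-frame time budget and a pixel throughput. The function name `frame_budget` and its parameters are illustrative assumptions and are not part of the proposed framework; only the figures 35 frames per second and 1080x1920 come from the text above.

```python
def frame_budget(fps: float, height: int, width: int) -> tuple[float, int]:
    """Return (milliseconds available per frame, pixels processed per second).

    Hypothetical helper for illustration only; it performs plain arithmetic
    on the frame rate and resolution and says nothing about the model itself.
    """
    ms_per_frame = 1000.0 / fps              # time budget for one frame
    pixels_per_second = int(fps * height * width)
    return ms_per_frame, pixels_per_second


# Figures quoted in the abstract: 35 frames per second at 1080x1920.
ms, pixels = frame_budget(35, 1080, 1920)
print(f"{ms:.2f} ms per frame, {pixels:,} pixels per second")
```

Since the abstract reports more than 35 frames per second, the resulting per-frame time is an upper bound on the average processing time per frame, and the pixel throughput is a lower bound.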
