Paper Information - Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning

Most existing supervised spatio-temporal video super-resolution (STVSR) methods rely heavily on a large-scale external dataset of paired low-resolution, low-frame-rate (LR-LFR) and high-resolution, high-frame-rate (HR-HFR) videos. Despite their remarkable performance, these methods assume that the low-resolution video is obtained by down-scaling the high-resolution video with a known degradation kernel, an assumption that does not hold in practical settings. They also cannot exploit the instance-specific internal information of a video at test time. Recently, deep internal learning approaches have gained attention for their ability to use the instance-specific statistics of a video. However, these methods have long inference times because they require thousands of gradient updates to learn the intrinsic structure of the data. In this work, we present Adaptive Video Super-Resolution (Ada-VSR), which leverages external and internal information through meta-transfer learning and internal learning, respectively. Specifically, meta-learning on a large-scale external dataset yields adaptive parameters that can quickly adapt to the novel condition (degradation model) of a given test video during the internal learning task, thereby exploiting both external and internal information of the video for super-resolution. A model trained with our approach adapts to a specific video condition with only a few gradient updates, which significantly reduces inference time. Extensive experiments on standard datasets demonstrate that our method performs favorably against various state-of-the-art approaches.
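The idea of meta-learning an initialization that adapts in a few gradient updates can be illustrated with a toy sketch. This is not the Ada-VSR network or its training procedure: it assumes a one-parameter model `y = w * x`, treats each "degradation" as a scalar `a`, and uses a first-order approximation of the meta-gradient. All function names and hyperparameters here are hypothetical.

```python
import random

def loss_grad(w, a, xs):
    """MSE of the toy model y = w * x against targets a * x, and its gradient in w."""
    n = len(xs)
    loss = sum((w * x - a * x) ** 2 for x in xs) / n
    grad = sum(2.0 * (w * x - a * x) * x for x in xs) / n
    return loss, grad

def adapt(w, a, xs, lr=0.1, steps=3):
    """Task-specific adaptation: a few plain gradient steps starting from w."""
    for _ in range(steps):
        _, g = loss_grad(w, a, xs)
        w -= lr * g
    return w

def meta_train(task_params, xs, inner_lr=0.1, outer_lr=0.05, iters=500, seed=0):
    """First-order meta-training over a set of tasks (an assumption of this
    sketch; the second-order term of the meta-gradient is ignored)."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iters):
        a = rng.choice(task_params)                    # sample a training task
        w_task = adapt(w, a, xs, inner_lr, steps=1)    # inner (task) update
        _, g = loss_grad(w_task, a, xs)                # loss after adaptation
        w -= outer_lr * g                              # outer (meta) update
    return w

if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5]
    w_meta = meta_train([1.8, 2.0, 2.2], xs)
    unseen = 2.1                                       # task not seen in meta-training
    loss_meta, _ = loss_grad(adapt(w_meta, unseen, xs), unseen, xs)
    loss_scratch, _ = loss_grad(adapt(-1.0, unseen, xs), unseen, xs)
    print("meta-initialized:", loss_meta, "from scratch:", loss_scratch)
```

With the same three adaptation steps on the unseen task, the meta-learned starting point ends much closer to the target than an arbitrary one, which is the effect the abstract attributes to combining external meta-training with short internal adaptation.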
