3D Deformable Kernels for Video Super-Resolution

Video super-resolution is drawing increasing attention in the computer vision community, and temporal modeling is crucial to it. A key challenge is to fully mine the spatio-temporal information in a video sequence. In this work, we propose a video super-resolution network based on 3D deformable kernels (DK3Dnet). Specifically, we introduce 3D deformable kernels (DK3D), which integrate deformable convolution with 3D convolution to enhance spatio-temporal modeling capability. To improve the quality of subsequent restoration, we use a Temporal and Spatial Attention fusion module (TSA fusion), in which attention is applied both temporally and spatially. Finally, we use channel-wise attention residual blocks (CARB) in the reconstruction module of DK3Dnet to enhance the quality of the reconstructed video frames. Experimental results show that DK3Dnet can exploit spatio-temporal information to improve the performance of video super-resolution.
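The abstract does not spell out the internals of CARB, so the following is only a minimal sketch of how a channel-wise attention residual block is commonly built: the residual branch is pooled to one value per channel, passed through a small bottleneck to produce a gate in (0, 1) per channel, rescaled by those gates, and added back to the input. The function names, the (C, H, W) nested-list layout, the `body` callable standing in for the block's convolutions, and the bottleneck weights are all illustrative assumptions, not the DK3Dnet implementation.

```python
import math


def global_avg_pool(feat):
    """Average each channel's H x W map down to a single scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def dense(vec, weights, bias):
    """Fully connected layer; `weights` has shape (out, in)."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def channel_attention(feat, w_down, b_down, w_up, b_up):
    """Return one gate in (0, 1) per channel (squeeze, bottleneck, sigmoid)."""
    squeezed = global_avg_pool(feat)
    hidden = [max(0.0, h) for h in dense(squeezed, w_down, b_down)]  # ReLU
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w_up, b_up)]


def carb(feat, body, w_down, b_down, w_up, b_up):
    """Hypothetical CARB: out = feat + gate * body(feat), gated per channel.

    `body` is a placeholder for the block's convolutional layers (assumed).
    """
    res = body(feat)
    gates = channel_attention(res, w_down, b_down, w_up, b_up)
    return [[[x + g * r for x, r in zip(x_row, r_row)]
             for x_row, r_row in zip(x_ch, r_ch)]
            for x_ch, r_ch, g in zip(feat, res, gates)]


if __name__ == "__main__":
    # Two channels of size 2 x 2; an identity body keeps the demo tiny.
    feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
    w_down, b_down = [[0.0, 0.0]], [0.0]       # 2 channels -> 1 hidden unit
    w_up, b_up = [[0.0], [0.0]], [0.0, 0.0]    # 1 hidden unit -> 2 gates
    # With all-zero weights every gate is sigmoid(0) = 0.5.
    print(carb(feat, lambda f: f, w_down, b_down, w_up, b_up))
```

In a real network the gates would be learned, so informative channels receive weights closer to 1 and less useful ones are suppressed before the residual addition.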
