论文信息 - Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos. Recently, deformable convolution based methods have achieved promising STVSR performance, but they could only infer the intermediate frame pre-defined in the training stage. Besides, these methods undervalued the short-term motion cues among adjacent frames. In this paper, we propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction. Specifically, we propose a Temporal Modulation Block (TMB) to modulate deformable convolution kernels for controllable feature interpolation. To well exploit the temporal information, we propose a Locally-temporal Feature Comparison (LFC) module, along with the Bi-directional Deformable ConvLSTM, to extract short-term and long-term motion cues in videos. Experiments on three benchmark datasets demonstrate that our TMNet outperforms previous STVSR methods. The code is available at https://github.com/CS-GangXu/TMNet.

[1] Shi-Min Hu,et al. Jittor: a novel deep learning framework with meta-operators and unified graph execution , 2020, Science China Information Sciences.

[2] Ming-Ming Cheng,et al. JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation , 2020, IEEE Transactions on Image Processing.

[3] Jan Kautz,et al. Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4] Chen Change Loy,et al. EDVR: Video Restoration With Enhanced Deformable Convolutional Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[5] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[6] Zhiyong Gao,et al. MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8] Gregory Shakhnarovich,et al. Recurrent Back-Projection Network for Video Super-Resolution , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Matthew F. Tang,et al. Neural dynamics of the attentional blink revealed by encoding orientation selectivity during rapid visual presentation , 2019, Nature Communications.

[10] Norimichi Ukita,et al. Space-Time-Aware Multi-Resolution Video Enhancement , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Christian Ledig,et al. Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Wei Wang,et al. CFSNet: Toward a Controllable Feature Space for Image Restoration , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[13] Guillermo Sapiro,et al. Deep Video Deblurring for Hand-Held Cameras , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Dit-Yan Yeung,et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting , 2015, NIPS.

[15] Yu Qiao,et al. Multi-Dimension Modulation for Image Restoration with Dynamic Controllable Residual Learning , 2019, ArXiv.

[16] Chenliang Xu,et al. TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution , 2018, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Yun Fu,et al. Image Super-Resolution Using Very Deep Residual Channel Attention Networks , 2018, ECCV.

[18] Subhashis Banerjee,et al. Space-Time Super-Resolution Using Graph-Cut Optimization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Eero P. Simoncelli,et al. Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[20] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.

[21] Feng Liu,et al. Video Frame Interpolation via Adaptive Separable Convolution , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22] Yaron Caspi,et al. Under the supervision of , 2003 .

[23] Yu Qiao,et al. Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[25] Xiaoyun Zhang,et al. Depth-Aware Video Frame Interpolation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Feng Liu,et al. Softmax Splatting for Video Frame Interpolation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Kyoung Mu Lee,et al. Enhanced Deep Residual Networks for Single Image Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[28] Feng Liu,et al. Context-Aware Synthesis for Video Frame Interpolation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29] Thomas S. Huang,et al. Multiframe image restoration and registration , 1984 .

[30] Yaron Caspi,et al. Increasing Space-Time Resolution in Video , 2002, ECCV.

[31] Michal Irani,et al. Space-time super-resolution from a single video , 2011, CVPR 2011.

[32] Xiaoou Tang,et al. Image Super-Resolution Using Deep Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33] Narendra Ahuja,et al. Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Seoung Wug Oh,et al. Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[35] Jan P. Allebach,et al. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36] Soo Ye Kim,et al. FISR: Deep Joint Frame Interpolation and Super-Resolution with A Multi-scale Temporal Loss , 2020, AAAI.

[37] Taeoh Kim,et al. AdaCoF: Adaptive Collaboration of Flows for Video Frame Interpolation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Xiaoou Tang,et al. Deep Network Interpolation for Continuous Imagery Effect Transition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Chao Ren,et al. Space-time super-resolution with patch group cuts prior , 2015, Signal Process. Image Commun..

[40] Stephen Lin,et al. Deformable ConvNets V2: More Deformable, Better Results , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Donald Geman,et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[42] Deqing Sun,et al. A Bayesian approach to adaptive video super resolution , 2011, CVPR 2011.

[43] Daniel Rueckert,et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Thomas Brox,et al. FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45] Jiajun Wu,et al. Video Enhancement with Task-Oriented Flow , 2018, International Journal of Computer Vision.

[46] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.

[47] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48] Donald Geman,et al. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images , 1984 .