EVSRNet: Efficient Video Super-Resolution with Neural Architecture Search

With the development of convolutional neural networks (CNNs), CNN-based super-resolution methods have far surpassed traditional methods. In particular, CNN-based single-image super-resolution methods have achieved excellent results. Video sequences contain more abundant information than single images, but few video super-resolution methods can run on mobile devices because of their heavy computational requirements, which limits the application of video super-resolution. In this work, we propose the Efficient Video Super-Resolution Network (EVSRNet), designed with neural architecture search, for real-time video super-resolution. Extensive experiments show that our method achieves a good balance between quality and efficiency. Finally, we achieve a competitive score of 7.36, with a PSNR of 27.85 dB and an inference time of 11.3 ms per frame on the target Snapdragon 865 SoC, which places 2nd in the Mobile AI (MAI) 2021 real-time video super-resolution challenge. Notably, our method is the fastest and outperforms other competitors by a large margin.
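
For readers less familiar with the two reported metrics, the following is a minimal sketch of how PSNR and per-frame latency are commonly computed and interpreted. It is not the challenge's evaluation code: the function names `psnr` and `frames_per_second` are hypothetical, and the peak value of 255 (8-bit pixels) is an assumption, since the evaluation protocol is not described here.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes 8-bit pixel values by default (max_value=255); this is an
    illustrative assumption, not the challenge's stated protocol.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


def frames_per_second(ms_per_frame):
    """Convert a per-frame latency in milliseconds to throughput in frames per second."""
    return 1000.0 / ms_per_frame


# The reported 11.3 ms per frame corresponds to roughly 88 frames per second.
throughput = frames_per_second(11.3)
```

Higher PSNR means the reconstructed frame is closer to the ground truth, while lower latency means higher throughput; the reported score combines both aspects.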
