Context-based video frame interpolation via depthwise over-parameterized convolution

Abstract. Video frame interpolation generates intermediate frames by estimating the motion of pixels between input frames. However, real-world video frames often suffer from blur, object occlusion, and sudden brightness changes. We propose a context-based video frame interpolation method built on depthwise over-parameterized convolution. First, the proposed network extracts the context graphs of the input frames. Next, an adaptive collaboration of flows is used to warp the input frames and the context graphs. A frame synthesis network then fuses the warped input frames and context graphs into a preliminary estimate of the interpolated frame. Finally, a post-processing module refines the result. Experimental results on several datasets show that the proposed method outperforms state-of-the-art methods both qualitatively and quantitatively.
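The abstract names depthwise over-parameterized convolution (DO-Conv) without describing it. As a rough illustration only, the sketch below follows our understanding of the general DO-Conv formulation, in which a conventional kernel W is composed with an extra depthwise kernel D. It checks on a single patch that applying D to the patch first gives the same output as folding D into W first. The function names, tensor shapes, and sizes are assumptions made for this example; how the layer is actually placed in the proposed network is not stated in the abstract.

```python
import random


def random_tensor(rng, *shape):
    """Nested lists of the given shape filled with uniform values in [-1, 1]."""
    if len(shape) == 1:
        return [rng.uniform(-1.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]


def feature_composition(P, D, W):
    """Apply the depthwise kernel D to the patch P, then the conventional kernel W.

    Assumed shapes: P is (K, C_in), D is (K, D_mul, C_in), W is (C_out, D_mul, C_in),
    where K is the number of spatial positions in the patch (for example 3 * 3).
    """
    K, d_mul, c_in = len(D), len(D[0]), len(D[0][0])
    # P_prime[d][c] = sum over k of D[k][d][c] * P[k][c]
    P_prime = [[sum(D[k][d][c] * P[k][c] for k in range(K)) for c in range(c_in)]
               for d in range(d_mul)]
    return [sum(W_o[d][c] * P_prime[d][c] for d in range(d_mul) for c in range(c_in))
            for W_o in W]


def kernel_composition(P, D, W):
    """Fold D into W to get an ordinary kernel W_prime, then apply it to P."""
    K, d_mul, c_in = len(D), len(D[0]), len(D[0][0])
    # W_prime[o][k][c] = sum over d of D[k][d][c] * W[o][d][c]
    W_prime = [[[sum(D[k][d][c] * W_o[d][c] for d in range(d_mul)) for c in range(c_in)]
                for k in range(K)]
               for W_o in W]
    return [sum(Wp_o[k][c] * P[k][c] for k in range(K) for c in range(c_in))
            for Wp_o in W_prime]


def demo(seed=0, K=9, d_mul=9, c_in=3, c_out=4):
    """Return both outputs for one random patch (all sizes are illustrative)."""
    rng = random.Random(seed)
    P = random_tensor(rng, K, c_in)
    D = random_tensor(rng, K, d_mul, c_in)
    W = random_tensor(rng, c_out, d_mul, c_in)
    return feature_composition(P, D, W), kernel_composition(P, D, W)
```

Because the two orderings agree up to floating-point rounding, the extra depthwise kernel can in principle be folded into a single ordinary kernel once training is done, so the over-parameterization need not add cost at inference time.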
