Video Frame Interpolation with Densely Queried Bilateral Correlation

Video Frame Interpolation (VFI) aims to synthesize intermediate frames that do not exist between existing frames. Flow-based VFI algorithms estimate intermediate motion fields and use them to warp the existing frames. The complexity of real-world motion and the absence of the reference frame make this motion estimation challenging. Many state-of-the-art approaches explicitly model the correlations between two neighboring frames to estimate motion more accurately. In common approaches, the receptive field of correlation modeling at higher resolution depends on the motion fields estimated beforehand. This receptive field dependency makes common motion estimation approaches cope poorly with small and fast-moving objects. To model correlations better and produce more accurate motion fields, we propose the Densely Queried Bilateral Correlation (DQBC), which removes the receptive field dependency problem and is therefore better suited to small and fast-moving objects. The motion fields generated with the help of DQBC are further refined and up-sampled with context features. Once the motion fields are fixed, a CNN-based SynthNet synthesizes the final interpolated frame. Experiments show that our approach achieves higher accuracy with less inference time than state-of-the-art methods. Source code is available at https://github.com/kinoud/DQBC.
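
To make the idea of a bilateral correlation concrete, the sketch below shows one possible reading of it; it is not the authors' implementation. It assumes linear motion and a middle frame at t = 0.5, so a pixel at position x with half-displacement d appears at x - d in the first frame and at x + d in the second. For every position and every candidate d in a fixed window, it scores the match between the two feature maps. The window is fixed in advance and does not depend on any previously estimated flow. All names (`sample`, `bilateral_correlation`, `radius`) are hypothetical, and the features are plain nested lists to keep the example dependency-free.

```python
def sample(feat, y, x):
    """Feature vector at (y, x) of a [C][H][W] map; zeros outside the map."""
    h, w = len(feat[0]), len(feat[0][0])
    if 0 <= y < h and 0 <= x < w:
        return [channel[y][x] for channel in feat]
    return [0.0] * len(feat)


def bilateral_correlation(f0, f1, radius):
    """Correlate f0 at (x - d) with f1 at (x + d) for every d in the window.

    Assumption (not taken from the paper): linear motion, middle frame at t = 0.5.
    Returns (corr, offsets), where corr[k][y][x] is the score for offsets[k].
    """
    c, h, w = len(f0), len(f0[0]), len(f0[0][0])
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    corr = []
    for dy, dx in offsets:
        plane = [[sum(a * b for a, b in zip(sample(f0, y - dy, x - dx),
                                            sample(f1, y + dy, x + dx))) / c
                  for x in range(w)]
                 for y in range(h)]
        corr.append(plane)
    return corr, offsets


def zeros(c, h, w):
    """Helper: an all-zero [C][H][W] feature map."""
    return [[[0.0] * w for _ in range(h)] for _ in range(c)]


# Toy example: a single feature "blob" that passes through (4, 4) in the
# middle frame with half-displacement (1, 2).
f0, f1 = zeros(4, 9, 9), zeros(4, 9, 9)
for ch in range(4):
    f0[ch][3][2] = 1.0  # position (4, 4) - (1, 2) in the first frame
    f1[ch][5][6] = 1.0  # position (4, 4) + (1, 2) in the second frame

corr, offsets = bilateral_correlation(f0, f1, radius=2)
scores = [plane[4][4] for plane in corr]
best = offsets[scores.index(max(scores))]
print(best)  # (1, 2)
```

In this toy case the highest score at the middle-frame position (4, 4) is at the true half-displacement. A real system would compute such a volume on learned features and pass it to a motion estimation network rather than taking an argmax.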
