ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

Video deblurring models exploit consecutive frames to remove blur caused by camera shake and object motion. To utilize neighboring sharp patches, typical methods rely mainly on homography or optical flow to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motion with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.
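As a rough illustration of what a correlation volume pyramid over all pixel pairs can look like, the following NumPy sketch computes dot-product correlations between every pixel of a reference feature map and every pixel of a neighboring one, then average-pools the neighbor dimensions to form coarser levels. The abstract does not specify the similarity measure, normalization, pooling scheme, or number of levels, so the scaled dot product, the 2x2 average pooling, the three levels, and the function names `correlation_volume` and `correlation_pyramid` are all assumptions made for this example, not the paper's implementation.

```python
import numpy as np


def correlation_volume(feat_ref, feat_nbr):
    """All-pairs correlation between two (C, H, W) feature maps.

    Returns an array of shape (H, W, H, W): entry [i, j, k, l] is the
    scaled dot product of reference pixel (i, j) and neighbor pixel (k, l).
    The 1/sqrt(C) scaling is an assumption made for this sketch.
    """
    c, h, w = feat_ref.shape
    ref = feat_ref.reshape(c, h * w)
    nbr = feat_nbr.reshape(c, h * w)
    corr = ref.T @ nbr / np.sqrt(c)  # (H*W, H*W)
    return corr.reshape(h, w, h, w)


def correlation_pyramid(corr, num_levels=3):
    """Build coarser levels by 2x2 average pooling over the neighbor dims.

    The reference dims stay at full resolution, so each reference pixel
    keeps correlations to the neighbor frame at several scales.
    """
    pyramid = [corr]
    for _ in range(num_levels - 1):
        h, w, hn, wn = pyramid[-1].shape
        if hn < 2 or wn < 2:
            break  # cannot pool further
        hn2, wn2 = hn // 2, wn // 2
        cropped = pyramid[-1][:, :, : hn2 * 2, : wn2 * 2]
        pooled = cropped.reshape(h, w, hn2, 2, wn2, 2).mean(axis=(3, 5))
        pyramid.append(pooled)
    return pyramid


# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
feat_ref = rng.standard_normal((16, 8, 8))
feat_nbr = rng.standard_normal((16, 8, 8))
pyramid = correlation_pyramid(correlation_volume(feat_ref, feat_nbr))
print([level.shape for level in pyramid])
```

Because the volume stores a score for every pixel pair, a reference pixel can be matched to a neighbor pixel at any displacement, which is the property the abstract contrasts with explicit alignment. The cost is memory that grows with the square of the number of pixels, so a volume like this is typically built on downsampled features.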
