Learning Event-Driven Video Deblurring and Interpolation

Event-based sensors, which respond when the change in pixel intensity exceeds a triggering threshold, can capture high-speed motion with microsecond accuracy. Assisted by an event camera, we can generate high frame-rate sharp videos from low frame-rate blurry ones captured by an intensity camera. In this paper, we propose an effective event-driven video deblurring and interpolation algorithm based on deep convolutional neural networks (CNNs). Motivated by the physical model in which the residuals between a blurry image and its sharp latent frames are the integrals of events, the proposed network uses events to estimate these residuals for sharp frame restoration. Since the triggering threshold varies spatially, we develop an effective method that estimates dynamic filters to account for this variation. To exploit temporal information, the sharp frames restored from the previous blurry frame are also incorporated. The proposed algorithm achieves superior performance against state-of-the-art methods on both synthetic and real datasets.
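The physical model the abstract alludes to can be sketched in code: in the log-intensity domain, the residual between a latent sharp frame at time t and a reference frame is the threshold-weighted integral (here, a discrete sum) of event polarities fired in between. The names below (`c`, `events`, the toy 4x4 resolution) are illustrative assumptions, not the paper's implementation, and the threshold is taken as spatially uniform for simplicity, whereas the paper explicitly models its spatial variation with dynamic filters.

```python
import numpy as np

def event_residual(events, t0, t, c=0.2, shape=(4, 4)):
    """Sum the polarities of events with timestamps in (t0, t],
    scaled by the triggering threshold c (assumed spatially uniform
    in this sketch). `events` is an array of (x, y, timestamp,
    polarity) rows, with polarity in {+1, -1}."""
    residual = np.zeros(shape)
    for x, y, ts, p in events:
        if t0 < ts <= t:
            residual[int(y), int(x)] += c * p
    return residual

def sharp_from_reference(log_ref, events, t0, t, c=0.2):
    """Latent log-intensity at time t: the reference log-intensity
    plus the event residual integrated over (t0, t]."""
    return log_ref + event_residual(events, t0, t, c, shape=log_ref.shape)
```

For example, two positive events at pixel (0, 0) and one negative event at (1, 1), all inside (0, 1], yield residuals of +2c and -c at those pixels respectively; adding the residual map to a reference log-intensity frame gives the latent sharp frame at the query time.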
