Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Many videos contain flickering artifacts. Common causes of flicker include video processing algorithms, video generation algorithms, and capturing videos under specific conditions. Prior work usually requires specific guidance, such as the flickering frequency, manual annotations, or extra consistent videos, to remove the flicker. In this work, we propose a general flicker removal framework that receives only a single flickering video as input, without additional guidance. Since the framework is blind to the specific flickering type and receives no guidance, we name the task "blind deflickering." The core of our approach is to use a neural atlas in cooperation with a neural filtering strategy. The neural atlas is a unified representation of all frames in a video that provides temporal consistency guidance but is flawed in many cases. To address this, a neural network is trained to mimic a filter that learns the consistent features (e.g., color, brightness) from the atlas while avoiding the artifacts the atlas introduces. To validate our method, we construct a dataset that contains diverse real-world flickering videos. Extensive experiments show that our method achieves satisfying deflickering performance and even outperforms baselines that use extra guidance on a public benchmark.
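The idea of borrowing consistent features from a shared representation while keeping each frame's own content can be illustrated with a toy sketch. This is not the method described above: the atlas here is only a per-pixel temporal median of a static synthetic scene (an assumption; the real atlas is a learned representation that handles motion), and the hand-written gain correction stands in for the trained neural filter. All function names are hypothetical.

```python
import numpy as np

def make_flickering_video(num_frames=8, size=16, seed=0):
    """Synthesize a static scene whose brightness flickers per frame."""
    rng = np.random.default_rng(seed)
    scene = rng.uniform(0.2, 0.8, size=(size, size, 3))
    gains = rng.uniform(0.7, 1.3, size=num_frames)  # global flicker
    return np.stack([scene * g for g in gains])

def toy_atlas(frames):
    """Stand-in for the atlas: one shared image for all frames.
    Assumes a static scene, so no frame-to-atlas mapping is needed."""
    return np.median(frames, axis=0)

def toy_filter(frame, atlas, eps=1e-8):
    """Stand-in for the learned filter: take the per-channel mean
    color/brightness from the atlas, keep the frame's own pixels."""
    gain = atlas.mean(axis=(0, 1)) / (frame.mean(axis=(0, 1)) + eps)
    return frame * gain

def deflicker(frames):
    atlas = toy_atlas(frames)
    return np.stack([toy_filter(f, atlas) for f in frames])

def brightness_variation(frames):
    """Standard deviation of mean frame brightness over time."""
    return float(np.std(frames.mean(axis=(1, 2, 3))))

if __name__ == "__main__":
    video = make_flickering_video()
    fixed = deflicker(video)
    print("variation before:", brightness_variation(video))
    print("variation after: ", brightness_variation(fixed))
```

Because only channel statistics are taken from the shared image, any spatial artifacts it contained would not be copied into the output pixels, which loosely mirrors the motivation for filtering rather than directly using a flawed atlas.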
