Events-To-Video: Bringing Modern Computer Vision to Event Cameras

Event cameras are novel sensors that report brightness changes in the form of asynchronous "events" instead of intensity frames. They have significant advantages over conventional cameras: high temporal resolution, high dynamic range, and no motion blur. Since the output of event cameras is fundamentally different from that of conventional cameras, it is commonly accepted that they require the development of specialized algorithms to accommodate the particular nature of events. In this work, we take a different view and propose to apply existing, mature computer vision techniques to videos reconstructed from event data. We propose a novel recurrent neural network to reconstruct videos from a stream of events and train it on a large amount of simulated event data. Our experiments show that our approach surpasses state-of-the-art reconstruction methods by a large margin (> 20%) in terms of image quality. We further apply off-the-shelf computer vision algorithms to videos reconstructed from event data on tasks such as object classification and visual-inertial odometry, and show that this strategy consistently outperforms algorithms that were specifically designed for event data. We believe that our approach opens the door to bringing the outstanding properties of event cameras to an entirely new range of tasks.
