Asynchronous Tracking-by-Detection on Adaptive Time Surfaces for Event-based Object Tracking

Event cameras, which are asynchronous bio-inspired vision sensors, have shown great potential in challenging conditions such as fast motion and low illumination. However, most event-based object tracking methods are designed for scenarios with untextured objects and uncluttered backgrounds, and few support bounding box-based object tracking. This work proposes an asynchronous Event-based Tracking-by-Detection (ETD) method for generic bounding box-based object tracking. To this end, we present an Adaptive Time-Surface with Linear Time Decay (ATSLTD) event-to-frame conversion algorithm, which asynchronously and effectively warps the spatio-temporal information of asynchronous retinal events into a sequence of ATSLTD frames with clear object contours. We feed the sequence of ATSLTD frames to the proposed ETD method, which leverages the high temporal resolution of event cameras to perform accurate and efficient object tracking. We compare the proposed ETD method with seven popular object tracking methods, based on either conventional cameras or event cameras, and with two variants of ETD. The experimental results show the superiority of the proposed ETD method in handling various challenging environments.
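
To make the idea of a time surface with linear time decay concrete, the following is a minimal sketch of a generic linear-decay event-to-frame conversion. It is not the ATSLTD algorithm itself: the abstract does not specify the adaptive window selection or the exact decay rule, so the fixed window `[t_start, t_end]`, the `Event` layout, and the function name `linear_decay_surface` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical event layout: (x, y, timestamp in seconds). Polarity is ignored here.
Event = Tuple[int, int, float]


def linear_decay_surface(
    events: List[Event],
    width: int,
    height: int,
    t_start: float,
    t_end: float,
) -> List[List[float]]:
    """Accumulate events from a fixed time window into one frame.

    Each pixel stores a value in [0, 1] that decays linearly with the age
    of its most recent event: 1.0 for an event at t_end, 0.0 for an event
    at t_start (which is therefore indistinguishable from "no event").
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    span = t_end - t_start
    frame = [[0.0] * width for _ in range(height)]
    for x, y, t in events:
        # Skip events outside the time window or outside the sensor area.
        if not (t_start <= t <= t_end):
            continue
        if not (0 <= x < width and 0 <= y < height):
            continue
        value = 1.0 - (t_end - t) / span
        # Keep the most recent (largest) value seen at this pixel.
        if value > frame[y][x]:
            frame[y][x] = value
    return frame
```

In this sketch, pixels touched by recent events are bright and older ones fade toward zero, which is one simple way a moving object's contour can stand out in the resulting frame. An adaptive variant would choose the window boundaries from the event stream rather than taking them as fixed arguments.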
