DART: Distribution Aware Retinal Transform for Event-Based Cameras

We introduce a generic visual descriptor for event cameras, termed the distribution aware retinal transform (DART), which encodes structural context using log-polar grids. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are employed directly as local descriptors in a bag-of-words classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) statistical bootstrapping is combined with online learning to overcome the low-sample problem during one-shot learning of the tracker; (ii) cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker, yielding a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, a case that has not been explicitly tackled in the event-based vision domain.
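The link between log-polar grids and cyclical shifts can be illustrated with a minimal sketch. This is a generic log-polar histogram of event coordinates, not the DART implementation; the function name `log_polar_histogram`, the grid size (4 rings by 8 wedges), the radius range, and the synthetic events are all assumptions made for illustration. It shows that rotating the events about the grid centre by one wedge width appears in the descriptor as a cyclic shift along the angular axis.

```python
import numpy as np

def log_polar_histogram(points, center, n_rings=4, n_wedges=8,
                        r_min=1.0, r_max=16.0):
    """Normalised count of points per (ring, wedge) bin of a log-polar grid.

    Hypothetical helper for illustration; parameters are assumed values.
    """
    d = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
    keep = (r >= r_min) & (r < r_max)          # ignore points outside the grid
    # Ring index grows with log(r), so ring widths grow geometrically.
    ring = np.floor(n_rings * np.log(r[keep] / r_min)
                    / np.log(r_max / r_min)).astype(int)
    ring = np.clip(ring, 0, n_rings - 1)
    wedge = np.floor(theta[keep] / (2 * np.pi) * n_wedges).astype(int) % n_wedges
    hist = np.zeros((n_rings, n_wedges))
    np.add.at(hist, (ring, wedge), 1)          # accumulate counts per bin
    return hist / max(hist.sum(), 1)

# Synthetic events placed at bin centres so that binning is unambiguous.
rng = np.random.default_rng(0)
rings = rng.integers(0, 4, size=200)
wedges = rng.integers(0, 8, size=200)
radii = 2.0 ** (rings + 0.5)
angles = (wedges + 0.5) * 2 * np.pi / 8
center = np.array([64.0, 64.0])
events = center + np.stack([radii * np.cos(angles),
                            radii * np.sin(angles)], axis=1)

# Rotate the events about the centre by exactly one wedge width (45 degrees).
phi = 2 * np.pi / 8
rot = np.array([[np.cos(phi), -np.sin(phi)],
                [np.sin(phi),  np.cos(phi)]])
rotated = center + (events - center) @ rot.T

h = log_polar_histogram(events, center)
h_rot = log_polar_histogram(rotated, center)

# The rotation shows up as a cyclic shift of one column along the wedge axis.
print(np.allclose(np.roll(h, 1, axis=1), h_rot))
```

A uniform scaling by the ring growth factor (2 in this sketch) similarly moves mass by one row along the ring axis, but only approximately as a cyclic shift, because points leave the grid at its inner and outer edges.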
