DART: Distribution Aware Retinal Transform for Event-Based Cameras

We introduce a generic visual descriptor for event cameras, termed the distribution aware retinal transform (DART), that encodes structural context using log-polar grids. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are employed directly as local descriptors in a bag-of-words classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) statistical bootstrapping is combined with online learning to overcome the low-sample problem during one-shot learning of the tracker, and (ii) cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker, yielding a high intersection-over-union score against augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, a case that has not been explicitly tackled in the event-based vision domain.
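The log-polar encoding and the cyclical shifts mentioned in item (2) can be illustrated with a small sketch. This is not the authors' implementation: it only counts 2D points in a log-polar grid and does not reproduce the distribution-aware aspects of DART. The function names, the grid sizes (`n_rings`, `n_wedges`) and the radii (`r_min`, `r_max`) are all assumptions chosen for illustration. What it shows is why a log-polar layout is convenient: rotating the input by one wedge angle, or scaling it by one ring ratio, becomes a shift of the histogram along one axis.

```python
import numpy as np

def log_polar_descriptor(points, center, n_rings=4, n_wedges=8,
                         r_min=1.0, r_max=16.0):
    """Normalised count of points per log-polar bin, shape (n_rings, n_wedges).

    Hypothetical helper: grid parameters are illustrative, not from the paper.
    """
    d = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
    keep = (r >= r_min) & (r < r_max)  # points outside the grid are ignored
    # Ring index grows linearly with log-radius; wedge index with angle.
    ring = np.floor(n_rings * np.log(r[keep] / r_min)
                    / np.log(r_max / r_min)).astype(int)
    ring = np.clip(ring, 0, n_rings - 1)
    wedge = np.floor(theta[keep] * n_wedges / (2 * np.pi)).astype(int) % n_wedges
    hist = np.zeros((n_rings, n_wedges))
    np.add.at(hist, (ring, wedge), 1.0)  # unbuffered add handles repeated bins
    total = hist.sum()
    return hist / total if total > 0 else hist

def best_cyclic_match(query, reference):
    """Wedge shift (0..n_wedges-1) that minimises the L1 distance to reference."""
    costs = [np.abs(np.roll(query, s, axis=1) - reference).sum()
             for s in range(reference.shape[1])]
    return int(np.argmin(costs))

# Synthetic points placed at bin centres (ring, wedge) of the default grid.
rings = np.array([0, 1, 1, 2, 2, 2])
wedges = np.array([0, 3, 3, 5, 7, 7])
radius = 1.0 * 16.0 ** ((rings + 0.5) / 4)
angle = (wedges + 0.5) * 2 * np.pi / 8
pts = np.stack([radius * np.cos(angle), radius * np.sin(angle)], axis=1)

hist = log_polar_descriptor(pts, (0.0, 0.0))

# Rotate by exactly one wedge (45 degrees): the histogram shifts along axis 1.
a = 2 * np.pi / 8
rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
hist_rot = log_polar_descriptor(pts @ rot.T, (0.0, 0.0))

# Scale by one ring ratio (16 ** (1/4) = 2): the histogram shifts along axis 0.
hist_scaled = log_polar_descriptor(2.0 * pts, (0.0, 0.0))
```

In this toy setting the rotated descriptor equals `np.roll(hist, 1, axis=1)`, and `best_cyclic_match` recovers the alignment by trying every wedge shift. The scale shift is only approximately cyclic in general, since points pushed past `r_max` leave the grid; the example avoids that by leaving the outermost ring empty.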
