Paper Information - Blending of Learning-based Tracking and Object Detection for Monocular Camera-based Target Following

Deep learning has recently started being applied to visual tracking of generic objects in video streams. For the purposes of robotics applications, it is very important for a target tracker to recover its track if it is lost due to heavy or prolonged occlusions or motion blur of the target. We present a real-time approach which fuses a generic target tracker and object detection module with a target re-identification module. Our work focuses on improving the performance of Convolutional Recurrent Neural Network-based object trackers in cases where the object of interest belongs to the category of "familiar" objects. Our proposed approach is sufficiently lightweight to track objects at 85-90 FPS while attaining competitive results on challenging benchmarks.

Martin Barczyk | Pranoy Panda
