RoI Pooling Based Fast Multi-Domain Convolutional Neural Networks for Visual Tracking

This paper proposes a fast multi-domain convolutional neural network method (Fast MDNet) for visual tracking. Fast MDNet builds on fast region-based convolutional neural networks (Fast R-CNN) and MDNet to efficiently track arbitrary objects using deep convolutional networks. We introduce an RoI pooling layer that shares full-image convolutional features across candidate regions, which significantly speeds up MDNet. Fast MDNet's online tracking rate is 15x that of MDNet, and it performs favorably against state-of-the-art methods on large benchmark datasets.

Keywords: visual tracking; Fast MDNet; CNN; RoI
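The feature sharing that RoI pooling provides can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `roi_max_pool`, the single-channel list-of-rows feature map, and the end-exclusive RoI convention are assumptions made for illustration. The idea shown is that the convolutional feature map is computed once per frame, and each candidate box is then max-pooled from that shared map into a fixed-size grid instead of being passed through the convolutional layers separately.

```python
from math import ceil, floor


def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool one RoI of a single-channel feature map to out_h x out_w.

    feature_map: list of rows (H x W), assumed already computed once
                 for the whole frame.
    roi: (y0, x0, y1, x1) in feature-map cells, end-exclusive
         (hypothetical convention chosen for this sketch).
    """
    y0, x0, y1, x1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Row range of bin i; floor/ceil keep every bin non-empty.
        ys = y0 + floor(i * roi_h / out_h)
        ye = y0 + ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            # Column range of bin j.
            xs = x0 + floor(j * roi_w / out_w)
            xe = x0 + ceil((j + 1) * roi_w / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# One shared feature map (toy values), several candidate boxes.
fmap = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
candidates = [(0, 0, 4, 4), (1, 1, 4, 4)]
pooled = [roi_max_pool(fmap, box, 2, 2) for box in candidates]
```

Both candidates yield a 2 x 2 output regardless of their size, so a downstream classifier can score them with fixed-size inputs while the expensive convolutional pass runs only once per frame.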
