HOOT: Heavy Occlusions in Object Tracking Benchmark

In this paper, we present HOOT, the Heavy Occlusions in Object Tracking Benchmark, a new visual object tracking dataset aimed at handling high-occlusion scenarios in single-object tracking. The dataset consists of 581 high-quality videos comprising 436K frames, densely annotated with rotated bounding boxes for targets spanning 74 object classes. It is geared toward the development, evaluation, and analysis of visual tracking algorithms that are robust to occlusions. Its videos have high occlusion levels: the median percentage of occluded frames per video is 68%. It also provides critical occlusion attributes, including a taxonomy of occluders, occlusion masks for every bounding box, and per-frame partial/full occlusion labels, among others. HOOT has been compiled to encourage the development of new methods targeting occlusion handling in visual tracking by providing training and test splits with high occlusion levels. This makes HOOT the first densely annotated, large dataset designed for single-object tracking under severe occlusion. We evaluate 15 state-of-the-art trackers on this new dataset to serve as a baseline for future work focusing on occlusions.
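
To make the per-video occlusion statistic concrete, the following is a minimal sketch of how a median percentage of occluded frames per video could be computed from per-frame occlusion labels. The label strings ("none", "partial", "full"), the function names, and the toy videos are assumptions made for illustration; they are not the dataset's actual annotation format or the authors' code.

```python
from statistics import median


def occluded_percentage(frame_labels):
    """Percentage of frames in one video labeled partially or fully occluded.

    `frame_labels` is a list with one hypothetical label per frame:
    "none", "partial", or "full".
    """
    if not frame_labels:
        raise ValueError("video has no frames")
    occluded = sum(1 for label in frame_labels if label in ("partial", "full"))
    return 100.0 * occluded / len(frame_labels)


def median_occluded_percentage(videos):
    """Median, over videos, of the per-video occluded-frame percentage."""
    return median(occluded_percentage(labels) for labels in videos.values())


# Toy data (hypothetical): three short videos with per-frame labels.
videos = {
    "video_a": ["none", "partial", "full", "full"],  # 3 of 4 frames occluded
    "video_b": ["none", "none", "partial", "none"],  # 1 of 4 frames occluded
    "video_c": ["partial", "full", "none", "none"],  # 2 of 4 frames occluded
}

result = median_occluded_percentage(videos)
print(f"median occluded frames per video: {result:.1f}%")
```

The median is taken over videos rather than over all frames pooled together, so a few very long videos do not dominate the statistic.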
