Tracking by Natural Language Specification with Long Short-term Context Decoupling
[1] Bohyung Han,et al. Towards Sequence-Level Training for Visual Tracking , 2022, ECCV.
[2] Junsong Yuan,et al. AiATrack: Attention in Attention for Transformer Visual Tracking , 2022, ECCV.
[3] Junqing Yu,et al. Transformer Tracking with Cyclic Shifting Window Attention , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Limin Wang,et al. MixFormer: End-to-End Tracking with Iterative Mixed Attention , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] L. Gool,et al. Transforming Model Prediction for Tracking , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Wanli Ouyang,et al. Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking , 2022, ECCV.
[7] Ding Ma,et al. Capsule-based Object Tracking with Natural Language Specification , 2021, ACM Multimedia.
[8] Hanqing Lu,et al. High-Performance Discriminative Tracking with Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] S. Sclaroff,et al. Siamese Natural Language Tracker: Tracking by Natural Language Descriptions with Siamese Trackers , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Wengang Zhou,et al. TransVG: End-to-End Visual Grounding with Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[11] Rong Jin,et al. Self-supervised Video Representation Learning by Context and Motion Decoupling , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Yonghong Tian,et al. Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Jianlong Fu,et al. Learning Spatio-Temporal Transformer for Visual Tracking , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Huchuan Lu,et al. Transformer Tracking , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Wengang Zhou,et al. Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Jianbo Jiao,et al. Self-supervised Video Representation Learning by Pace Prediction , 2020, ECCV.
[17] Yong Jae Lee,et al. Audiovisual SlowFast Networks for Video Recognition , 2020, ArXiv.
[18] Stan Sclaroff,et al. Robust Visual Object Tracking with Natural Language Region Proposal Network , 2019, ArXiv.
[19] D. Mahajan,et al. Self-Supervised Learning by Cross-Modal Audio-Video Clustering , 2019, NeurIPS.
[20] Ross B. Girshick,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Cordelia Schmid,et al. Learning Video Representations using Contrastive Bidirectional Transformer , 2019.
[22] S. Sclaroff,et al. Real-time Visual Object Tracking with Natural Language Description , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[23] Hao Chen,et al. FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[24] Silvio Savarese,et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Fan Yang,et al. LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[27] Ross B. Girshick,et al. Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization , 2018, NeurIPS.
[28] Wei Wu,et al. High Performance Visual Tracking with Siamese Region Proposal Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Cees G. M. Snoek,et al. Tracking by Natural Language Specification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Xin Pan,et al. YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[32] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Ming-Hsuan Yang,et al. Object Tracking Benchmark , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.