TarViS: A Unified Approach for Target-Based Video Segmentation
[1] D. Ramanan,et al. BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video , 2022, 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[2] Anima Anandkumar,et al. MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training , 2022, NeurIPS.
[3] A. Yuille,et al. In Defense of Online Models for Video Instance Segmentation , 2022, ECCV.
[4] P. Luo,et al. Towards Grand Unification of Object Tracking , 2022, ECCV.
[5] Ho Kei Cheng,et al. XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model , 2022, ECCV.
[6] Seoung Wug Oh,et al. VITA: Video Instance Segmentation via Object Token Association , 2022, NeurIPS.
[7] Yunchao Wei,et al. Large-scale Video Panoptic Segmentation in the Wild: A Benchmark , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] André Susano Pinto,et al. UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes , 2022, NeurIPS.
[9] Sergio Gomez Colmenarejo,et al. A Generalist Agent , 2022, Trans. Mach. Learn. Res.
[10] Oriol Vinyals,et al. Flamingo: a Visual Language Model for Few-Shot Learning , 2022, NeurIPS.
[11] D. Ramanan,et al. HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Liunian Harold Li,et al. Grounded Language-Image Pre-training , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Olivier J. Hénaff,et al. Perceiver IO: A General Architecture for Structured Inputs & Outputs , 2021, ICLR.
[14] Philip H. S. Torr,et al. Occluded Video Instance Segmentation: A Benchmark , 2021, International Journal of Computer Vision.
[15] Alexander G. Schwing,et al. Mask2Former for Video Instance Segmentation , 2021, ArXiv.
[16] Laura Leal-Taixé,et al. A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings , 2021, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
[17] Euntai Kim,et al. Hierarchical Memory Matching Network for Video Object Segmentation , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[18] Quoc V. Le,et al. Multi-Task Self-Training for Learning General Representations , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[19] Martin Danelljan,et al. Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation , 2021, NeurIPS.
[20] Chi-Keung Tang,et al. Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation , 2021, NeurIPS.
[21] Seoung Wug Oh,et al. Video Instance Segmentation using Inter-Frame Communication Transformers , 2021, NeurIPS.
[22] Yi Yang,et al. Associating Objects with Transformers for Video Object Segmentation , 2021, NeurIPS.
[23] In So Kweon,et al. Learning to Associate Every Segment for Video Panoptic Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Jiaya Jia,et al. Video Instance Segmentation with a Propose-Reduce Paradigm , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] H. Yao,et al. Efficient Regional Memory Network for Video Object Segmentation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ho Kei Cheng,et al. Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Andrew Zisserman,et al. Perceiver: General Perception with Iterative Attention , 2021, ICML.
[28] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[29] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[30] Daniel Cremers,et al. STEP: Segmenting and Tracking Every Pixel , 2021, NeurIPS Datasets and Benchmarks.
[31] Heng Wang,et al. Is Space-Time Attention All You Need for Video Understanding? , 2021, ICML.
[32] Raquel Urtasun,et al. VideoClick: Video Object Segmentation with a Single Click , 2021, ArXiv.
[33] Alan Yuille,et al. ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Ding Liu,et al. CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation , 2020, AAAI.
[35] Chunhua Shen,et al. End-to-End Video Instance Segmentation with Transformers , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Bin Li,et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection , 2020, ICLR.
[37] Philip H. S. Torr,et al. HOTA: A Higher Order Metric for Evaluating Multi-object Tracking , 2020, International Journal of Computer Vision.
[38] Song Bai,et al. SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation , 2021, ArXiv.
[39] Fahad Shahbaz Khan,et al. SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation , 2020, ECCV.
[40] In So Kweon,et al. Video Panoptic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Nicolas Usunier,et al. End-to-End Object Detection with Transformers , 2020, ECCV.
[42] Laura Leal-Taixé,et al. STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos , 2020, ECCV.
[43] Yunchao Wei,et al. Collaborative Video Object Segmentation by Foreground-Background Integration , 2020, ECCV.
[44] 知秀 柴田. Understand It in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .
[45] Ross B. Girshick,et al. PointRend: Image Segmentation As Rendering , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Gedas Bertasius,et al. Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Maxwell D. Collins,et al. Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Trevor Darrell,et al. BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning , 2018, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Kilian Y. Pfeiffer,et al. Visual Person Understanding through Multi-Task and Multi-Dataset Learning , 2019, GCPR.
[50] Yuchen Fan,et al. Video Instance Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[51] Ning Xu,et al. Video Object Segmentation Using Space-Time Memory Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[52] Bastian Leibe,et al. FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Andreas Geiger,et al. MOTS: Multi-Object Tracking and Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Kaiming He,et al. Panoptic Feature Pyramid Networks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Fan Yang,et al. LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[57] Leonidas J. Guibas,et al. Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[58] Bernard Ghanem,et al. TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild , 2018, ECCV.
[59] Qiang Yang,et al. An Overview of Multi-task Learning , 2018 .
[60] Peter Kontschieder,et al. The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[61] Bolei Zhou,et al. Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[63] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[64] Luc Van Gool,et al. The 2017 DAVIS Challenge on Video Object Segmentation , 2017, ArXiv.
[65] Iasonas Kokkinos,et al. UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[67] Stefan Roth,et al. MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.
[68] Jitendra Malik,et al. Segmentation of Moving Objects by Long Term Video Analysis , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[69] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[70] Charless C. Fowlkes,et al. Globally-optimal greedy algorithms for tracking a variable number of objects , 2011, CVPR 2011.
[71] Jitendra Malik,et al. Object Segmentation by Long Term Analysis of Point Trajectories , 2010, ECCV.
[72] Luc Van Gool,et al. Moving obstacle detection in highly dynamic scenes , 2009, 2009 IEEE International Conference on Robotics and Automation.
[73] Luc Van Gool,et al. Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[74] Horst Bischof,et al. Real-Time Tracking via On-line Boosting , 2006, BMVC.
[75] Demetri Terzopoulos,et al. Snakes: Active contour models , 1988, International Journal of Computer Vision.
[76] Larry S. Davis,et al. Non-parametric Model for Background Subtraction , 2000, ECCV.
[77] W. Eric L. Grimson,et al. Adaptive background mixture models for real-time tracking , 1999, 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Jitendra Malik,et al. Robust Multiple Car Tracking with Occlusion Reasoning , 1994, ECCV.
[79] Rich Caruana,et al. Multitask Learning: A Knowledge-Based Source of Inductive Bias , 1993, ICML.