Shih-Fu Chang | Wei Liu | Hanwang Zhang | Long Chen | Jun Xiao | Wenbo Ma
[1] Pablo Arbeláez,et al. Dynamic Multimodal Instance Segmentation guided by natural language queries , 2018, ECCV.
[2] Larry S. Davis,et al. Modeling Context Between Objects for Referring Expression Understanding , 2016, ECCV.
[3] Shiliang Pu,et al. Counterfactual Samples Synthesizing for Robust Visual Question Answering , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yichen Wei,et al. Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[5] Qi Qian,et al. Learning to Rank Proposals for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Yuning Jiang,et al. Acquisition of Localization Confidence for Accurate Object Detection , 2018, ECCV.
[7] John F. Canny,et al. Grounding Human-To-Vehicle Advice for Self-Driving Vehicles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Xiaogang Wang,et al. Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Hanwang Zhang,et al. Learning to Assemble Neural Module Tree Networks for Visual Grounding , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Lianli Gao,et al. Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[12] Ramakant Nevatia,et al. Query-Guided Regression Network with Context Policy for Phrase Grounding , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[13] Long Chen,et al. Counterfactual Critic Multi-Agent Training for Scene Graph Generation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Licheng Yu,et al. A Joint Speaker-Listener-Reinforcer Model for Referring Expressions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Huchuan Lu,et al. Bi-Directional Relationship Inferring Network for Referring Image Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Markus H. Gross,et al. Neural Sequential Phrase Grounding (SeqGROUND) , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Trevor Darrell,et al. Natural Language Object Retrieval , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Margrit Betke,et al. Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes , 2019, ECCV.
[19] Jiebo Luo,et al. A Fast and Accurate One-Stage Approach to Visual Grounding , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Bernt Schiele,et al. Learning Non-maximum Suppression , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Zhiwu Lu,et al. Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[22] Chen Qian,et al. A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Zheng-Jun Zha,et al. Joint Visual Grounding with Language Scene Graphs , 2019.
[24] Chenxi Liu,et al. Recurrent Multimodal Interaction for Referring Image Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[25] Yunchao Wei,et al. Referring Image Segmentation via Cross-Modal Progressive Comprehension , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Yang Wang,et al. Cross-Modal Self-Attention Network for Referring Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Lars Petersson,et al. Improving Object Localization with Fitness NMS and Bounded IoU Loss , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Richang Hong,et al. Learning to Compose and Reason with Language Tree Structures for Visual Grounding , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Lin Ma,et al. Real-Time Referring Expression Comprehension by Single-Stage Grounding Network , 2018, ArXiv.
[30] Tat-Seng Chua,et al. SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Jian Shao,et al. Boundary Proposal Network for Two-Stage Natural Language Video Localization , 2021, AAAI.
[32] Qi Wu,et al. Parallel Attention: A Unified Framework for Visual Object Discovery Through Dialogs and Queries , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[33] Trevor Darrell,et al. Segmentation from Natural Language Expressions , 2016, ECCV.
[34] Licheng Yu,et al. Modeling Context in Referring Expressions , 2016, ECCV.
[35] Yizhou Yu,et al. Dynamic Graph Attention for Referring Expression Comprehension , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[36] Xiaojuan Qi,et al. Referring Image Segmentation via Recurrent Refinement Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Shih-Fu Chang,et al. Grounding Referring Expressions in Images by Variational Context , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[38] Licheng Yu,et al. MAttNet: Modular Attention Network for Referring Expression Comprehension , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Jiebo Luo,et al. Improving One-stage Visual Grounding by Recursive Sub-query Construction , 2020, ECCV.
[40] Liujuan Cao,et al. Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[42] Long Chen,et al. Rethinking the Bottom-Up Framework for Query-Based Video Localization , 2020, AAAI.
[43] Trevor Darrell,et al. Modeling Relationships in Referential Expressions with Compositional Modular Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Yizhou Yu,et al. Graph-Structured Referring Expression Reasoning in the Wild , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[46] Yaser Al-Onaizan,et al. Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions , 2020, ACL.
[47] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[48] Larry S. Davis,et al. Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[49] Shih-Fu Chang,et al. Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] C. Lawrence Zitnick,et al. Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.
[51] Vicente Ordonez,et al. ReferItGame: Referring to Objects in Photographs of Natural Scenes , 2014, EMNLP.
[52] Yunhong Wang,et al. Adaptive NMS: Refining Pedestrian Detection in a Crowd , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Alan L. Yuille,et al. Generation and Comprehension of Unambiguous Object Descriptions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Hongliang Li,et al. Key-Word-Aware Network for Referring Expression Image Segmentation , 2018, ECCV.
[55] Yoav Artzi,et al. TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[57] Hwann-Tzong Chen,et al. See-Through-Text Grouping for Referring Image Segmentation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[58] Zhou Yu,et al. Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding , 2018, IJCAI.