暂无分享,去创建一个
[1] Gregory R. Koch,et al. Siamese Neural Networks for One-Shot Image Recognition , 2015 .
[2] David J. Fleet,et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives , 2017, BMVC.
[3] Manh-Duy Nguyen,et al. Graph-Based Indexing and Retrieval of Lifelog Data , 2021, MMM.
[4] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[5] Michael S. Bernstein,et al. Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Zhedong Zheng,et al. Dual-path Convolutional Image-Text Embeddings with Instance Loss , 2017, ACM Trans. Multim. Comput. Commun. Appl..
[7] Jia-Bin Huang,et al. 3D Photography Using Context-Aware Layered Depth Inpainting , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Stefan Lee,et al. Graph R-CNN for Scene Graph Generation , 2018, ECCV.
[9] Peter Young,et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions , 2014, TACL.
[10] Basura Fernando,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[11] Xueming Qian,et al. Position Focused Attention Network for Image-Text Matching , 2019, IJCAI.
[12] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Ji Liu,et al. IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Nan Duan,et al. Knowledge Aware Semantic Concept Expansion for Image-Text Matching , 2019, IJCAI.
[15] Pham The Bao,et al. Accurate brain extraction using Active Shape Model and Convolutional Neural Networks , 2018, ArXiv.
[16] Yongdong Zhang,et al. Multi-Modality Cross Attention Network for Image and Sentence Matching , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[18] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[19] Xiaogang Wang,et al. Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation , 2018, ECCV.
[20] Yizhou Sun,et al. Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity , 2019, IJCAI.
[21] Xilin Chen,et al. Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[22] Wei Liu,et al. Learning to Compose Dynamic Tree Structures for Visual Contexts , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Li Fei-Fei,et al. Image Generation from Scene Graphs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Philip S. Yu,et al. A Comprehensive Survey on Graph Neural Networks , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[25] Quoc V. Le,et al. Searching for Activation Functions , 2018, arXiv.
[26] Chunxiao Liu,et al. Graph Structured Network for Image-Text Matching , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Yejin Choi,et al. Neural Motifs: Scene Graph Parsing with Global Context , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Martin Wattenberg,et al. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.
[29] Yang Yang,et al. Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.
[30] Xi Chen,et al. Stacked Cross Attention for Image-Text Matching , 2018, ECCV.
[31] Yan Huang,et al. Learning Semantic Concepts and Order for Image and Sentence Matching , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[33] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[34] Ning Xu,et al. Scene graph captioner: Image captioning based on structural visual representation , 2019, J. Vis. Commun. Image Represent..