Cross-modal multi-relationship aware reasoning for image-text matching
Linbo Qing | Luping Liu | Xiaohai He | Jin Zhang | Xiaodong Luo
[1] Yang Yang,et al. Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking , 2019, ACM Multimedia.
[2] Xiaogang Wang,et al. Identity-Aware Textual-Visual Matching with Latent Co-attention , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[3] Za'er Salim Abo-Hammour,et al. Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm , 2014, Inf. Sci.
[4] Yongdong Zhang,et al. Focus Your Attention: A Bidirectional Focal Attention Network for Image-Text Matching , 2019, ACM Multimedia.
[5] Jiebo Luo,et al. Relational Reasoning using Prior Knowledge for Visual Captioning , 2019, ArXiv.
[6] Richard S. Zemel,et al. Gated Graph Sequence Neural Networks , 2015, ICLR.
[7] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[8] Trevor Darrell,et al. Sequence to Sequence -- Video to Text , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[9] Lin Ma,et al. Bidirectional image-sentence retrieval by local and global deep matching , 2019, Neurocomputing.
[10] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Rita Cucchiara,et al. A unified cycle-consistent neural model for text and image retrieval , 2020, Multimedia Tools and Applications.
[12] Yu Cheng,et al. Relation-Aware Graph Attention Network for Visual Question Answering , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[13] Zhoujun Li,et al. Bi-Directional Spatial-Semantic Attention Networks for Image-Text Matching , 2019, IEEE Transactions on Image Processing.
[14] Lior Wolf,et al. Associating neural word embeddings with deep image representations using Fisher Vectors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Ah Chung Tsoi,et al. The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.
[16] Heng Tao Shen,et al. Cross-Modal Attention With Semantic Consistence for Image–Text Matching , 2020, IEEE Transactions on Neural Networks and Learning Systems.
[17] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[18] Stefan Lee,et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks , 2019, NeurIPS.
[19] Peter Young,et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions , 2014, TACL.
[20] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Tao Mei,et al. Exploring Visual Relationship for Image Captioning , 2018, ECCV.
[22] Pietro Liò,et al. Graph Attention Networks , 2017, ICLR.
[23] David J. Fleet,et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives , 2017, BMVC.
[24] Richard Socher,et al. Interpretable Counting for Visual Question Answering , 2017, ICLR.
[25] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[27] Yin Li,et al. Learning Deep Structure-Preserving Image-Text Embeddings , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Xi Chen,et al. Stacked Cross Attention for Image-Text Matching , 2018, ECCV.
[29] Yao Zhao,et al. Cross-Modal Retrieval With CNN Visual Features: A New Baseline , 2017, IEEE Transactions on Cybernetics.
[30] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[31] Xiao Lin,et al. Leveraging Visual Question Answering for Image-Caption Ranking , 2016, ECCV.
[32] Jung-Woo Ha,et al. Dual Attention Networks for Multimodal Reasoning and Matching , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Lianli Gao,et al. Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Erwin M. Bakker,et al. CycleMatch: A cycle-consistent embedding network for image-text matching , 2019, Pattern Recognit.
[35] Andrea Esuli,et al. Transformer Reasoning Network for Image-Text Matching and Retrieval , 2021, 2020 25th International Conference on Pattern Recognition (ICPR).
[36] Liwei Wang,et al. Learning Two-Branch Neural Networks for Image-Text Matching Tasks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[38] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017.
[39] Chunxiao Liu,et al. Graph Structured Network for Image-Text Matching , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Max Welling,et al. Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.
[41] Gang Wang,et al. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[43] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[44] Zhedong Zheng,et al. Dual-path Convolutional Image-Text Embeddings with Instance Loss , 2017, ACM Trans. Multim. Comput. Commun. Appl.
[45] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[46] Yan Huang,et al. Learning Semantic Concepts and Order for Image and Sentence Matching , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[47] Huchuan Lu,et al. Deep Cross-Modal Projection Learning for Image-Text Matching , 2018, ECCV.
[48] Hai Zhuge,et al. Extractive summarization of documents with images based on multi-modal RNN , 2019, Future Gener. Comput. Syst.
[49] Lin Ma,et al. Multimodal Convolutional Neural Networks for Matching Image and Sentence , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[50] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[51] Yun Fu,et al. Visual Semantic Reasoning for Image-Text Matching , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[52] Xueming Qian,et al. Position Focused Attention Network for Image-Text Matching , 2019, IJCAI.
[53] Jonathon S. Hare,et al. Learning to Count Objects in Natural Images for Visual Question Answering , 2018, ICLR.
[54] Sarah Parisot,et al. Learning Conditioned Graph Structures for Interpretable Visual Question Answering , 2018, NeurIPS.
[55] Wei Wang,et al. Instance-Aware Image and Sentence Matching with Selective Multimodal LSTM , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).