Boon: A Neural Search Engine for Cross-Modal Information Retrieval
[1] F. Fischer, et al. ChatGPT for good? On opportunities and challenges of large language models for education, 2023, Learning and Individual Differences.
[2] Axel Finke, et al. VITR: Augmenting Vision Transformers with Relation-Focused Learning for Cross-Modal Information Retrieval, 2023, arXiv:2302.06350.
[3] G. Shih, et al. ChatGPT and Other Large Language Models Are Double-edged Swords, 2023, Radiology.
[4] chatGPT, et al. A Conversation on Artificial Intelligence, Chatbots, and Plagiarism in Higher Education, 2023, Cellular and Molecular Bioengineering.
[5] Yizhao Gao, et al. COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Errui Ding, et al. ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Y. Fu, et al. Image-Text Embedding Learning via Visual and Textual Semantic Reasoning, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[8] Lu Yuan, et al. RegionCLIP: Region-based Language-Image Pretraining, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Hui Fang, et al. On the Limitations of Visual-Semantic Embedding Networks for Image-to-Text Information Retrieval, 2021, J. Imaging.
[10] Yan Peng, et al. Dual-stream Network for Visual Recognition, 2021, NeurIPS.
[11] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[12] Zi Huang, et al. Aggregation-Based Graph Convolutional Hashing for Unsupervised Cross-Modal Retrieval, 2021, IEEE Transactions on Multimedia.
[13] Yuning Jiang, et al. Learning the Best Pooling Strategy for Visual Semantic Embedding, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Andrea Esuli, et al. Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders, 2020, ACM Trans. Multim. Comput. Commun. Appl.
[15] Weifeng Zhang, et al. Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering, 2020, Pattern Recognit.
[16] Hao Tian, et al. ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph, 2020, AAAI.
[17] Yu Cheng, et al. Large-Scale Adversarial Training for Vision-and-Language Representation Learning, 2020, NeurIPS.
[18] Yongdong Zhang, et al. Multi-Modality Cross Attention Network for Image and Sentence Matching, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Qi Zhang, et al. Context-Aware Attention Network for Image-Text Retrieval, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[21] Hao Wang, et al. FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval, 2020, SIGIR.
[22] Yu Cheng, et al. UNITER: UNiversal Image-TExt Representation Learning, 2019, ECCV.
[23] Yun Fu, et al. Visual Semantic Reasoning for Image-Text Matching, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[24] Kilian Q. Weinberger, et al. BERTScore: Evaluating Text Generation with BERT, 2019, ICLR.
[25] Xi Chen, et al. Stacked Cross Attention for Image-Text Matching, 2018, ECCV.
[26] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[27] David J. Fleet, et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives, 2017, BMVC.
[28] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[29] Kenneth Ward Church, et al. Word2Vec, 2016, Natural Language Engineering.
[30] Max Welling, et al. Semi-Supervised Classification with Graph Convolutional Networks, 2016, ICLR.
[31] Saurabh Singh, et al. Where to Look: Focus Regions for Visual Question Answering, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Alan L. Yuille, et al. Generation and Comprehension of Unambiguous Object Descriptions, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Fei-Fei Li, et al. Deep visual-semantic alignments for generating image descriptions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Samy Bengio, et al. Show and tell: A neural image caption generator, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[37] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[38] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[39] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[40] Yong Rui, et al. Image search—from thousands to billions in 20 years, 2013, TOMCCAP.
[41] Timothy Baldwin, et al. langid.py: An Off-the-shelf Language Identification Tool, 2012, ACL.
[42] David G. Lowe, et al. Object recognition from local scale-invariant features, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[43] S. Hochreiter, et al. Long Short-Term Memory, 1997, Neural Computation.
[44] Tefko Saracevic, et al. Evaluation of evaluation in information retrieval, 1995, SIGIR '95.
[45] Hao Yang, et al. PFAN++: Bi-Directional Image-Text Retrieval With Position Focused Attention Network, 2021, IEEE Transactions on Multimedia.
[46] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[47] Elif Derya Übeyli, et al. Recurrent Neural Networks, 2018.