PAM: Understanding Product Images in Cross Product Category Attribute Extraction
Rongmei Lin | Nasser Zalmout | Xin Luna Dong | Xiang He | Jie Feng | Yan Liang | Li Xiong
[1] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[2] Ernest Valveny, et al. Word Spotting and Recognition with Embedded Attributes, 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Lei Zhang, et al. VinVL: Making Visual Representations Matter in Vision-Language Models, 2021, ArXiv.
[4] Xinlei Chen, et al. Towards VQA Models That Can Read, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Lei Zhang, et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[7] Michael S. Bernstein, et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations, 2016, International Journal of Computer Vision.
[8] Byoung-Tak Zhang, et al. Bilinear Attention Networks, 2018, NeurIPS.
[9] Mohit Bansal, et al. LXMERT: Learning Cross-Modality Encoder Representations from Transformers, 2019, EMNLP.
[10] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[11] Tomas Mikolov, et al. Enriching Word Vectors with Subword Information, 2016, TACL.
[12] Trevor Darrell, et al. Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Xinyu Jiang, et al. Scaling up Open Tagging from Tens to Thousands: Comprehension Empowered Attribute Value Extraction from Product Title, 2019, ACL.
[15] Feifei Li, et al. OpenTag: Open Attribute Value Extraction from Product Profiles, 2018, KDD.
[16] Jiebo Luo, et al. TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Yue Wang, et al. Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product, 2020, EMNLP.
[18] Li Yang, et al. Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach, 2020, KDD.
[19] Wei Xu, et al. Bidirectional LSTM-CRF Models for Sequence Tagging, 2015, ArXiv.
[20] Jing Huang, et al. Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting, 2020, ECCV.
[21] P. Serdyukov, et al. Sequence Modeling with Unconstrained Generation Order, 2019, NeurIPS.
[22] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.
[23] Fabio Petroni, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020, NeurIPS.
[24] Jun Ma, et al. AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types, 2020, KDD.
[25] Yoshua Bengio, et al. Gated Feedback Recurrent Neural Networks, 2015, ICML.
[26] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[27] Xin Luna Dong, et al. TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories, 2020, ACL.