[1] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[2] Omer Levy,et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension , 2019, ACL.
[3] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[4] Li Deng,et al. Deep Learning for Image-to-Text Generation: A Technical Overview , 2017, IEEE Signal Processing Magazine.
[5] Huaiyu Zhu. On Information and Sufficiency , 1997.
[6] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.
[7] Christopher D. Manning,et al. Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.
[8] Yu Zhou,et al. MSMO: Multimodal Summarization with Multimodal Output , 2018, EMNLP.
[9] Dragomir R. Radev,et al. LexRank: Graph-based Lexical Centrality as Salience in Text Summarization , 2004, J. Artif. Intell. Res.
[10] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[11] Mirella Lapata,et al. Text Summarization with Pretrained Encoders , 2019, EMNLP.
[12] Yu Zhou,et al. Multimodal Summarization with Guidance of Multimodal Reference , 2020, AAAI.
[13] Bowen Zhou,et al. SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents , 2016, AAAI.
[14] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[15] Rohini K. Srihari,et al. Computational models for integrating linguistic and visual information: A survey , 2004, Artificial Intelligence Review.
[16] Georg Heigold,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2021, ICLR.
[17] Yanan Cao,et al. Flexible Non-Autoregressive Extractive Summarization with Threshold: How to Extract a Non-Fixed Number of Summary Sentences , 2021, AAAI.
[18] Pengfei Liu,et al. GSum: A General Framework for Guided Neural Abstractive Summarization , 2021, NAACL.
[19] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[20] Jun Wang,et al. LAMS: A Location-aware Approach for Multimodal Summarization (Student Abstract) , 2021, AAAI.
[21] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Jaemin Cho,et al. Unifying Vision-and-Language Tasks via Text Generation , 2021, ICML.
[23] Louis-Philippe Morency,et al. Multimodal Machine Learning: A Survey and Taxonomy , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[25] Furu Wei,et al. Faithful to the Original: Fact Aware Neural Abstractive Summarization , 2017, AAAI.
[26] Ildoo Kim,et al. ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision , 2021, ICML.