Triangle-Reward Reinforcement Learning: A Visual-Linguistic Semantic Alignment for Image Captioning
暂无分享,去创建一个
Yongdong Zhang | Weizhi Nie | An-An Liu | Ning Xu | Xuanya Li | Jiesi Li | N. Xu | Anan Liu | Weizhi Nie | Yongdong Zhang | Xuanya Li | Jiesi Li
[1] Tao Mei,et al. Exploring Visual Relationship for Image Captioning , 2018, ECCV.
[2] Jin Young Choi,et al. Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Yi Yang,et al. Convolutional Reconstruction-to-Sequence for Video Captioning , 2020, IEEE Transactions on Circuits and Systems for Video Technology.
[4] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[5] Cyrus Rashtchian,et al. Every Picture Tells a Story: Generating Sentences from Images , 2010, ECCV.
[6] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[7] Yongdong Zhang,et al. Part-Aware Interactive Learning for Scene Graph Generation , 2020, ACM Multimedia.
[8] Yongdong Zhang,et al. Double-Bit Quantization and Index Hashing for Nearest Neighbor Search , 2019, IEEE Transactions on Multimedia.
[9] Vaibhava Goel,et al. Self-Critical Sequence Training for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[11] Kenneth Ward Church,et al. Word2Vec , 2016, Natural Language Engineering.
[12] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[14] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[16] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[17] Wei Liu,et al. Recurrent Fusion Network for Image Captioning , 2018, ECCV.
[18] Yejin Choi,et al. Neural Motifs: Scene Graph Parsing with Global Context , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[19] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[21] Dan Klein,et al. Accurate Unlexicalized Parsing , 2003, ACL.
[22] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[23] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Li Fei-Fei,et al. Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval , 2015, VL@EMNLP.
[25] Wei Liu,et al. Learning to Guide Decoding for Image Captioning , 2018, AAAI.
[26] Yongdong Zhang,et al. Multi-Level Policy and Reward-Based Deep Reinforcement Learning Framework for Image Captioning , 2020, IEEE Transactions on Multimedia.
[27] Jianwei Yang,et al. Neural Baby Talk , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Liqiang Nie,et al. Scalable Deep Hashing for Large-Scale Social Image Retrieval , 2020, IEEE Transactions on Image Processing.
[29] Jonathon S. Hare,et al. Learning to Count Objects in Natural Images for Visual Question Answering , 2018, ICLR.
[30] Yongdong Zhang,et al. Adaptively Clustering-Driven Learning for Visual Relationship Detection , 2020, IEEE Transactions on Multimedia.
[31] Simao Herdade,et al. Image Captioning: Transforming Objects into Words , 2019, NeurIPS.
[32] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[33] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[34] Svetlana Lazebnik,et al. Two Can Play This Game: Visual Dialog with Discriminative Question Generation and Answering , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[35] Eduard H. Hovy,et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics , 2003, NAACL.
[36] Hongtao Lu,et al. Look Back and Predict Forward in Image Captioning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Siqi Liu,et al. Improved Image Captioning via Policy Gradient optimization of SPIDEr , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[38] Shaomei Wu,et al. Automatic Alt-text: Computer-generated Image Descriptions for Blind Users on a Social Network Service , 2017, CSCW.
[39] Long Chen,et al. Counterfactual Critic Multi-Agent Training for Scene Graph Generation , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[40] Ning Zhang,et al. Deep Reinforcement Learning-Based Image Captioning with Embedding Reward , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Eric Brachmann,et al. PoseAgent: Budget-Constrained 6D Object Pose Estimation via Reinforcement Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Shouling Ji,et al. Deep Dual Consecutive Network for Human Pose Estimation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Zhenguang Liu,et al. Combining Graph Neural Networks With Expert Knowledge for Smart Contract Vulnerability Detection , 2021, IEEE Transactions on Knowledge and Data Engineering.
[44] Basura Fernando,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[45] Jianfei Cai,et al. Auto-Encoding Scene Graphs for Image Captioning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Feng Liu,et al. Actor-Critic Sequence Training for Image Captioning , 2017, ArXiv.
[47] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[48] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[49] Gang Wang,et al. Stack-Captioning: Coarse-to-Fine Learning for Image Captioning , 2017, AAAI.
[50] Gang Hua,et al. Collaborative Deep Reinforcement Learning for Joint Object Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Yongdong Zhang,et al. Context-Aware Visual Policy Network for Sequence-Level Image Captioning , 2018, ACM Multimedia.