Improving Image Caption Performance with Linguistic Context
[1] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.
[2] Cyrus Rashtchian, et al. Every Picture Tells a Story: Generating Sentences from Images, 2010, ECCV.
[3] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[4] Zheng Wang, et al. A deep-learning based feature hybrid framework for spatiotemporal saliency detection inside videos, 2018, Neurocomputing.
[5] David A. Forsyth, et al. Matching Words and Pictures, 2003, J. Mach. Learn. Res.
[6] Feng Wu, et al. Background Prior-Based Salient Object Detection via Deep Reconstruction Residual, 2015, IEEE Transactions on Circuits and Systems for Video Technology.
[7] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[8] Xuelong Li, et al. Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement, 2018, Pattern Recognit.
[9] Rui Zhang, et al. A Novel Deep Density Model for Unsupervised Learning, 2018, Cognitive Computation.
[10] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[11] Yiannis Aloimonos, et al. Corpus-Guided Sentence Generation of Natural Images, 2011, EMNLP.
[12] Hang Dong, et al. Joint Multi-Label Attention Networks for Social Text Annotation, 2019, NAACL.
[13] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[14] Yejin Choi, et al. Baby talk: Understanding and generating simple image descriptions, 2011, CVPR 2011.
[15] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[16] Fei Yin, et al. Handwritten Chinese Text Recognition by Integrating Multiple Contexts, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Fei-Fei Li, et al. Deep visual-semantic alignments for generating image descriptions, 2015, CVPR.
[18] Yoshua Bengio, et al. Word Representations: A Simple and General Method for Semi-Supervised Learning, 2010, ACL.
[19] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[20] Hui Chen, et al. Attend to Knowledge: Memory-Enhanced Attention Network for Image Captioning, 2018, BICS.
[21] Richard Socher, et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Joshua Goodman, et al. A bit of progress in language modeling, 2001, Comput. Speech Lang.
[23] Samy Bengio, et al. Show and tell: A neural image caption generator, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Fei Yin, et al. Integrating Language Model in Handwritten Chinese Text Recognition, 2009, 2009 10th International Conference on Document Analysis and Recognition.
[26] C. Lawrence Zitnick, et al. CIDEr: Consensus-based image description evaluation, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[28] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[29] Nick C. Ellis, et al. Frequency effects in language acquisition: A review with implications for theories of implicit and explicit language acquisition. (Target article), 2002.