An Empirical Study of Language CNN for Image Captioning
暂无分享,去创建一个
Gang Wang | Jianfei Cai | Tsuhan Chen | Jiuxiang Gu | G. Wang | Tsuhan Chen | Jianfei Cai | Jiuxiang Gu
[1] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Xu Jia,et al. Guiding Long-Short Term Memory for Image Caption Generation , 2015, ArXiv.
[3] Jason Weston,et al. Memory Networks , 2014, ICLR.
[4] Ye Yuan,et al. Review Networks for Caption Generation , 2016, NIPS.
[5] Gang Wang,et al. Deep Level Sets for Salient Object Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Jiebo Luo,et al. Image Captioning with Semantic Attention , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Zhengdong Lu,et al. A Convolutional Architecture for Word Sequence Prediction , 2015 .
[8] Zhe Gan,et al. Variational Autoencoder for Deep Learning of Images, Labels and Captions , 2016, NIPS.
[9] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[10] Hang Li,et al. Convolutional Neural Network Architectures for Matching Natural Language Sentences , 2014, NIPS.
[11] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[12] Ruslan Salakhutdinov,et al. Multimodal Neural Language Models , 2014, ICML.
[13] Wei Xu,et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN) , 2014, ICLR.
[14] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Ting Liu,et al. Recent advances in convolutional neural networks , 2015, Pattern Recognit..
[16] Gang Wang,et al. Global Context-Aware Attention LSTM Networks for 3D Action Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Basura Fernando,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[18] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Vaibhava Goel,et al. Self-Critical Sequence Training for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[21] Li Fei-Fei,et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[23] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[24] Alon Lavie,et al. Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.
[25] Chunhua Shen,et al. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.
[27] Gang Wang,et al. Recurrent Attentional Networks for Saliency Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Xu Jia,et al. Guiding the Long-Short Term Memory Model for Image Caption Generation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[29] Chenxi Liu,et al. Attention Correctness in Neural Image Captioning , 2016, AAAI.
[30] Tamara L. Berg,et al. Baby Talk : Understanding and Generating Image Descriptions , 2011 .
[31] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Restarts , 2016, ArXiv.
[32] Tianbao Yang,et al. Learning Attributes Equals Multi-Source Domain Generalization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[34] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[35] Vicente Ordonez,et al. Im2Text: Describing Images Using 1 Million Captioned Photographs , 2011, NIPS.
[36] Gang Wang,et al. Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition , 2016, ECCV.
[37] Barnabás Póczos,et al. The Statistical Recurrent Unit , 2017, ICML.
[38] Qun Liu,et al. genCNN: A Convolutional Architecture for Word Sequence Prediction , 2015, ACL.
[39] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[40] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[41] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[42] Jürgen Schmidhuber,et al. Recurrent Highway Networks , 2016, ICML.
[43] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.
[44] Phil Blunsom,et al. A Convolutional Neural Network for Modelling Sentences , 2014, ACL.
[45] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[46] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[47] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[48] Geoffrey E. Hinton,et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units , 2015, ArXiv.
[49] Ye Yuan,et al. Encode, Review, and Decode: Reviewer Module for Caption Generation , 2016, ArXiv.
[50] Jürgen Schmidhuber,et al. Training Very Deep Networks , 2015, NIPS.
[51] Richard Socher,et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Xinlei Chen,et al. Mind's eye: A recurrent visual representation for image caption generation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[55] Dumitru Erhan,et al. Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[56] Daniel Jurafsky,et al. A Hierarchical Neural Autoencoder for Paragraphs and Documents , 2015, ACL.
[57] Ronan Collobert,et al. Phrase-based Image Captioning , 2015, ICML.
[58] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[60] Svetlana Lazebnik,et al. Improving Image-Sentence Embeddings Using Large Weakly Annotated Photo Collections , 2014, ECCV.
[61] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, International Journal of Computer Vision.