Query-Guided Regression Network with Context Policy for Phrase Grounding
暂无分享,去创建一个
[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[3] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[4] Kan Chen,et al. AMC: Attention Guided Multi-modal Correlation Learning for Image Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[6] Eric P. Xing,et al. Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[8] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006 .
[9] Rada Mihalcea,et al. Structured Matching for Phrase Localization , 2016, ECCV.
[10] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[11] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[12] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[13] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[14] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.
[16] Licheng Yu,et al. A Joint Speaker-Listener-Reinforcer Model for Referring Expressions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[18] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Rasmus Pagh,et al. Fast and scalable polynomial kernels via explicit feature maps , 2013, KDD.
[20] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[21] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[22] Wei Xu,et al. ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering , 2015, ArXiv.
[23] Larry S. Davis,et al. Modeling Context Between Objects for Referring Expression Understanding , 2016, ECCV.
[24] Trevor Darrell,et al. Grounding of Textual Phrases in Images by Reconstruction , 2015, ECCV.
[25] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Koen E. A. van de Sande,et al. Selective Search for Object Recognition , 2013, International Journal of Computer Vision.
[27] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[28] Vicente Ordonez,et al. ReferItGame: Referring to Objects in Photographs of Natural Scenes , 2014, EMNLP.
[29] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[30] C. Lawrence Zitnick,et al. Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.
[31] Trevor Darrell,et al. Natural Language Object Retrieval , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[33] ZissermanAndrew,et al. The Pascal Visual Object Classes Challenge , 2015 .
[34] Ondrej Chum,et al. CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples , 2016, ECCV.
[35] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[36] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[37] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Svetlana Lazebnik,et al. Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[39] Licheng Yu,et al. Modeling Context in Referring Expressions , 2016, ECCV.