暂无分享,去创建一个
Radu Soricut | Soravit Changpinyo | Bo Pang | Piyush Sharma | Radu Soricut | Piyush Sharma | Soravit Changpinyo | Bo Pang
[1] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[2] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Trevor Darrell,et al. Large Scale Visual Recognition through Adaptation using Joint Representation and Multiple Instance Learning , 2016, J. Mach. Learn. Res..
[4] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[6] Chen Sun,et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[7] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[8] Jianwei Yang,et al. Neural Baby Talk , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] George A. Miller,et al. Introduction to WordNet: An On-line Lexical Database , 1990 .
[10] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[11] Christopher D. Manning,et al. GQA: a new dataset for compositional question answering over real-world images , 2019, ArXiv.
[12] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[14] Peng Gao,et al. Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Kaiming He,et al. Exploring the Limits of Weakly Supervised Pretraining , 2018, ECCV.
[16] Tim Salimans,et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks , 2016, NIPS.
[17] Matti Pietikäinen,et al. Deep Learning for Generic Object Detection: A Survey , 2018, International Journal of Computer Vision.
[18] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[19] Xinlei Chen,et al. Pythia v0.1: the Winning Entry to the VQA Challenge 2018 , 2018, ArXiv.
[20] Fei Sha,et al. Aligning Where to See and What to Tell: Image Captioning with Region-Based Attention and Scene-Specific Contexts , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] Jonathon S. Hare,et al. Learning to Count Objects in Natural Images for Visual Question Answering , 2018, ICLR.
[22] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[23] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[24] Quoc V. Le,et al. DropBlock: A regularization method for convolutional networks , 2018, NeurIPS.
[25] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.
[26] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[27] Baoyuan Wu,et al. Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning , 2019, IEEE Access.
[28] Radu Soricut,et al. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning , 2018, ACL.
[29] Jiebo Luo,et al. VizWiz Grand Challenge: Answering Visual Questions from Blind People , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Richard S. Zemel,et al. Exploring Models and Data for Image Question Answering , 2015, NIPS.
[31] Xingyi Zhou,et al. Bottom-Up Object Detection by Grouping Extreme and Center Points , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Trevor Darrell,et al. Captioning Images with Diverse Objects , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Luc Van Gool,et al. Domain Adaptive Faster R-CNN for Object Detection in the Wild , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Cordelia Schmid,et al. Areas of Attention for Image Captioning , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[37] Yuxing Tang,et al. Visual and Semantic Knowledge Transfer for Large Scale Semi-Supervised Object Detection , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[39] Zhen Li,et al. Graph-RISE: Graph-Regularized Image Semantic Embedding , 2019, ArXiv.
[40] Ali Farhadi,et al. YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Basura Fernando,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[42] Byoung-Tak Zhang,et al. Bilinear attention networks for VizWiz challenge , 2018 .
[43] Bohyung Han,et al. Transfer Learning via Unsupervised Task Discovery for Visual Question Answering , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Chin-Yew Lin,et al. Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics , 2004, ACL.