Matthew J. Hausknecht | Mohit Shridhar | Yonatan Bisk | Adam Trischler | Xingdi Yuan | Marc-Alexandre Côté
[1] Li Fei-Fei,et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Hannes Schulz,et al. Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation , 2017, ArXiv.
[3] Bowen Zhou,et al. Pointing the Unknown Words , 2016, ACL.
[4] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[5] Jürgen Schmidhuber,et al. Recurrent World Models Facilitate Policy Evolution , 2018, NeurIPS.
[6] Chelsea Finn,et al. Language as an Abstraction for Hierarchical Deep Reinforcement Learning , 2019, NeurIPS.
[7] Ali Farhadi,et al. Visual Semantic Planning Using Deep Successor Representations , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[8] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[10] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[11] Ali Farhadi,et al. IQA: Visual Question Answering in Interactive Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Malte Helmert,et al. The Fast Downward Planning System , 2006, J. Artif. Intell. Res.
[13] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[14] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[15] Cordelia Schmid,et al. VideoBERT: A Joint Model for Video and Language Representation Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[16] Benjamin Kuipers,et al. Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions , 2006, AAAI.
[17] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[18] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Craig A. Knoblock,et al. PDDL-the planning domain definition language , 1998.
[20] Romain Laroche,et al. Counting to Explore and Generalize in Text-based Games , 2018, ArXiv.
[21] Yuan-Fang Wang,et al. Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Edward Grefenstette,et al. The NetHack Learning Environment , 2020, NeurIPS.
[23] Stefan Lee,et al. Embodied Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[24] Licheng Yu,et al. MAttNet: Modular Attention Network for Referring Expression Comprehension , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Jianqiang Huang,et al. Unbiased Scene Graph Generation From Biased Training , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Matthew J. Hausknecht,et al. Interactive Fiction Games: A Colossal Adventure , 2020, AAAI.
[27] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[28] Joshua B. Tenenbaum,et al. Deep Convolutional Inverse Graphics Network , 2015, NIPS.
[29] Qi Wu,et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Jiajun Wu,et al. Learning to See Physics via Visual De-animation , 2017, NIPS.
[31] Stefan Lee,et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks , 2019, NeurIPS.
[32] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[33] Romain Laroche,et al. Learning Dynamic Belief Graphs to Generalize on Text-Based Games , 2020, NeurIPS.
[34] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[35] Yuandong Tian,et al. Hierarchical Decision Making by Generating and Following Natural Language Instructions , 2019, NeurIPS.
[36] Trevor Darrell,et al. Modeling Relationships in Referential Expressions with Compositional Modular Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Roozbeh Mottaghi,et al. ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Alexander M. Rush,et al. Bottom-Up Abstractive Summarization , 2018, EMNLP.
[39] Matthew J. Hausknecht,et al. TextWorld: A Learning Environment for Text-based Games , 2018, CGW@IJCAI.
[40] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[41] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[42] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[43] Matthew J. Hausknecht,et al. Graph Constrained Reinforcement Learning for Natural Language Action Spaces , 2020, ICLR.
[44] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).