CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication
Xinlei Chen | Yuandong Tian | Byoung-Tak Zhang | Marcus Rohrbach | Nikita Kitaev | Dhruv Batra | Devi Parikh | Jin-Hwa Kim
[1] C. Lawrence Zitnick,et al. Bringing Semantics into Focus Using Visual Abstraction , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Sanja Fidler,et al. MovieQA: Understanding Stories in Movies through Question-Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Dan Klein,et al. Unified Pragmatic Models for Generating and Following Instructions , 2017, NAACL.
[4] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[5] Jianfeng Gao,et al. Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation , 2017, IJCNLP.
[6] Quoc V. Le,et al. A Neural Conversational Model , 2015, ArXiv.
[7] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[8] Qi Wu,et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Percy Liang,et al. Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings , 2017, ACL.
[11] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Olivier Pietquin,et al. End-to-end optimization of goal-driven and visually grounded dialogue systems , 2017, IJCAI.
[13] Yash Goyal,et al. Yin and Yang: Balancing and Answering Binary Visual Questions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Matthew R. Walter,et al. Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences , 2015, AAAI.
[15] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[16] Devi Parikh,et al. It Takes Two to Tango: Towards Theory of AI's Mind , 2017, ArXiv.
[17] R. Kirk. Convention: A Philosophical Study , 1970 .
[18] Jianfeng Gao,et al. Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.
[19] Richard S. Zemel,et al. Exploring Models and Data for Image Question Answering , 2015, NIPS.
[20] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[21] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[22] Jason Weston,et al. Learning End-to-End Goal-Oriented Dialog , 2016, ICLR.
[23] Matthew R. Walter,et al. Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation , 2016, 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[24] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Stefan Lee,et al. Embodied Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[26] David Lewis. Convention: A Philosophical Study , 1986 .
[27] Kallirroi Georgila,et al. An ISU Dialogue System Exhibiting Reinforcement Learning of Dialogue Policies: Generic Slot-Filling in the TALK In-car System , 2006, EACL.
[28] Stefan Lee,et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[29] Licheng Yu,et al. Visual Madlibs: Fill in the Blank Description Generation and Question Answering , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[30] José M. F. Moura,et al. Visual Dialog , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Yuandong Tian,et al. Simple Baseline for Visual Question Answering , 2015, ArXiv.
[32] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Xinlei Chen,et al. Mind's eye: A recurrent visual representation for image caption generation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Daniel Jurafsky,et al. Learning to Follow Navigational Directions , 2010, ACL.
[35] Richard Socher,et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Hang Li,et al. Neural Responding Machine for Short-Text Conversation , 2015, ACL.
[37] Michael Silk,et al. Languages and Language , 2013 .
[38] Yann Dauphin,et al. Deal or No Deal? End-to-End Learning of Negotiation Dialogues , 2017, EMNLP.
[39] Wei Xu,et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering , 2015, NIPS.
[40] Xiang Zhang,et al. Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems , 2015, ICLR.
[41] Jason Weston,et al. Learning through Dialogue Interactions by Asking Questions , 2016, ICLR.
[42] Dan Klein,et al. Speaker-Follower Models for Vision-and-Language Navigation , 2018, NeurIPS.
[43] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[44] Lawrence W. Barsalou,et al. Perceptions of perceptual symbols , 1999, Behavioral and Brain Sciences.
[45] Oliver Lemon,et al. A Simple and Generic Belief Tracking Mechanism for the Dialog State Tracking Challenge: On the believability of observed information , 2013, SIGDIAL Conference.
[46] Matthew R. Walter,et al. A framework for learning semantic maps from grounded natural language descriptions , 2014, Int. J. Robotics Res..
[47] José M. F. Moura,et al. Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog , 2017, EMNLP.
[48] Christopher D. Manning,et al. Learning Language Games through Interaction , 2016, ACL.
[49] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[50] Dumitru Erhan,et al. Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[51] Ewan Klein,et al. Natural Language Processing with Python , 2009 .
[52] Jianfeng Gao,et al. A Neural Network Approach to Context-Sensitive Generation of Conversational Responses , 2015, NAACL.
[53] Matthew R. Walter,et al. Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation , 2011, AAAI.
[54] Hugo Larochelle,et al. GuessWhat?! Visual Object Discovery through Multi-modal Dialogue , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[56] Jason Weston,et al. Dialogue Learning With Human-In-The-Loop , 2016, ICLR.
[57] Percy Liang,et al. Simpler Context-Dependent Logical Forms via Model Projections , 2016, ACL.
[58] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[59] Lucy Vanderwende,et al. Learning the Visual Interpretation of Sentences , 2013, 2013 IEEE International Conference on Computer Vision.
[60] Christopher D. Manning,et al. Naturalizing a Programming Language via Interactive Learning , 2017, ACL.
[61] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).