Early Fusion for Goal Directed Robotic Vision
Yejin Choi | Dieter Fox | Yonatan Bisk | Yoav Artzi | Aaron Walsman | Dipendra Kumar Misra | Saadia Gabriel
[1] Daniel Marcu,et al. Natural Language Communication with Robots , 2016, NAACL.
[2] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[3] Koray Kavukcuoglu,et al. Visual Attention , 2020, Computational Models for Cognitive Vision.
[4] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Daniel Jurafsky,et al. Learning to Follow Navigational Directions , 2010, ACL.
[6] Demis Hassabis,et al. Grounded Language Learning in a Simulated 3D World , 2017, ArXiv.
[7] John Langford,et al. Mapping Instructions and Visual Observations to Actions with Reinforcement Learning , 2017, EMNLP.
[8] J. Wolfe,et al. Guided Search 2.0: A revised model of visual search , 1994, Psychonomic Bulletin & Review.
[9] Rahul Sukthankar,et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[10] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[11] Christopher D. Manning,et al. Compositional Attention Networks for Machine Reasoning , 2018, ICLR.
[12] Alan L. Yuille,et al. Generation and Comprehension of Unambiguous Object Descriptions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Dieter Fox,et al. Following directions using statistical machine translation , 2010, 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[14] John K. Tsotsos,et al. Modeling Visual Attention via Selective Tuning , 1995, Artif. Intell.
[15] Kate Saenko,et al. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.
[16] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[17] Vicente Ordonez,et al. ReferItGame: Referring to Objects in Photographs of Natural Scenes , 2014, EMNLP.
[18] Henrik I. Christensen,et al. Computational visual attention systems and their cognitive foundations: A survey , 2010, TAP.
[19] Thomas A. Funkhouser,et al. MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments , 2017, ArXiv.
[20] Andrew Bennett,et al. CHALET: Cornell House Agent Learning Environment , 2018, ArXiv.
[21] Yuandong Tian,et al. Building Generalizable Agents with a Realistic and Rich 3D Environment , 2018, ICLR.
[22] Wei Xu,et al. ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering , 2015, ArXiv.
[23] Simone Frintrop,et al. Goal-Directed Search with a Top-Down Modulated Computational Attention System , 2005, DAGM-Symposium.
[24] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[25] Raymond J. Mooney,et al. Learning to Interpret Natural Language Navigation Instructions from Observations , 2011, Proceedings of the AAAI Conference on Artificial Intelligence.
[26] Jasdeep Singh,et al. Attention on Attention: Architectures for Visual Question Answering (VQA) , 2018, ArXiv.
[27] Ali Farhadi,et al. IQA: Visual Question Answering in Interactive Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Ross A. Knepper,et al. Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction , 2018, CoRL.
[29] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.
[30] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[31] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[32] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[33] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[34] Wolfram Burgard,et al. Imitation learning with generalized task descriptions , 2009, 2009 IEEE International Conference on Robotics and Automation.
[35] Louis-Philippe Morency,et al. Using Syntax to Ground Referring Expressions in Natural Images , 2018, AAAI.
[36] Simon Brodeur,et al. HoME: a Household Multimodal Environment , 2017, ICLR.
[37] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[38] Qi Wu,et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017.
[40] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[41] Alexey Dosovitskiy,et al. End-to-End Driving Via Conditional Imitation Learning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[42] Ming Liu,et al. Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[43] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[44] Ross A. Knepper,et al. Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning , 2018, Robotics: Science and Systems.
[45] Andrew Bennett,et al. Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction , 2018, EMNLP.
[46] Siddhartha S. Srinivasa,et al. A System for Multi-step Mobile Manipulation: Architecture, Algorithms, and Experiments , 2016, ISER.
[47] Dan Klein,et al. Alignment-Based Compositional Semantics for Instruction Following , 2015, EMNLP.
[48] Stefan Lee,et al. Embodied Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[49] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[50] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[51] Oliver Brock,et al. Lessons from the Amazon Picking Challenge: Four Aspects of Building Robotic Systems , 2016, IJCAI.
[52] Jason Yosinski,et al. An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution , 2018, NeurIPS.
[53] Luke S. Zettlemoyer,et al. Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions , 2013, TACL.
[54] Matthew R. Walter,et al. Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation , 2011, AAAI.