IQA: Visual Question Answering in Interactive Environments
Ali Farhadi | Dieter Fox | Mohammad Rastegari | Joseph Redmon | Daniel Gordon | Aniruddha Kembhavi
[1] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[2] Yoram Koren,et al. The vector field histogram-fast obstacle avoidance for mobile robots , 1991, IEEE Trans. Robotics Autom..
[3] G. Oriolo,et al. On-line map building and navigation for autonomous mobile robots , 1995, Proceedings of 1995 IEEE International Conference on Robotics and Automation.
[4] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[5] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[6] Simon Lacroix,et al. Reactive navigation in outdoor environments using potential fields , 1998, Proceedings. 1998 IEEE International Conference on Robotics and Automation (Cat. No.98CH36146).
[7] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[8] Ramakant Nevatia,et al. Symbolic Navigation with a Generic Map , 1999, Auton. Robots.
[9] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[10] Yoshiaki Shirai,et al. Autonomous visual navigation of a mobile robot using a human-guided experience , 2002, Robotics Auton. Syst..
[11] Andrew J. Davison,et al. Real-time simultaneous localisation and mapping with a single camera , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[12] Manuela M. Veloso,et al. Visual sonar: fast obstacle avoidance using monocular vision , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[13] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[14] Patrick Gros,et al. Robot motion control from a visual memory , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[15] Michel Dhome,et al. Outdoor autonomous navigation using monocular vision , 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Ashutosh Saxena,et al. High speed obstacle avoidance using monocular vision and reinforcement learning , 2005, ICML.
[17] Christos Dimitrakakis,et al. TORCS, The Open Racing Car Simulator , 2005 .
[18] Masahiro Tomono,et al. 3-D Object Map Building Using Dense Object Models with SIFT-based Recognition Features , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] David Wooden,et al. A guide to vision-based map building , 2006, IEEE Robotics & Automation Magazine.
[20] James J. Little,et al. Autonomous vision-based exploration and mapping using hybrid maps and Rao-Blackwellised particle filters , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[21] Parvaneh Saeedi,et al. Vision-based 3-D trajectory tracking for unknown environments , 2006, IEEE Transactions on Robotics.
[22] Nicholas Roy,et al. Trajectory Optimization using Reinforcement Learning for Map Exploration , 2008, Int. J. Robotics Res..
[23] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[24] Bernhard Nebel,et al. Integrating symbolic and geometric planning for mobile manipulation , 2009, 2009 IEEE International Workshop on Safety, Security & Rescue Robotics (SSRR 2009).
[25] Vincent Lepetit,et al. View-based Maps , 2010, Int. J. Robotics Res..
[26] David Vázquez,et al. Learning appearance in virtual scenarios for pedestrian detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[27] Leslie Pack Kaelbling,et al. Hierarchical task and motion planning in the now , 2011, 2011 IEEE International Conference on Robotics and Automation.
[28] P. Cochat,et al. Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[30] Pieter Abbeel,et al. Using Classical Planners for Tasks with Continuous Operators in Robotics , 2013 .
[31] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[32] Kewei Tu,et al. Joint Video and Text Parsing for Understanding Events and Answering Queries , 2013, IEEE MultiMedia.
[33] Pieter Abbeel,et al. Combined task and motion planning through an extensible planner-independent interface layer , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[34] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[35] Paul Newman,et al. Scene Signatures: Localised and Point-less Features for Localisation , 2014, Robotics: Science and Systems.
[36] Daniel Cremers,et al. LSD-SLAM: Large-Scale Direct Monocular SLAM , 2014, ECCV.
[37] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[38] Wei Xu,et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question , 2015, NIPS.
[39] J. M. M. Montiel,et al. ORB-SLAM: A Versatile and Accurate Monocular SLAM System , 2015, IEEE Transactions on Robotics.
[40] Markus Schoeler,et al. Semantic Pose Using Deep Networks Trained on Synthetic RGB-D , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[41] Jiajun Wu,et al. Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.
[42] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[43] Mario Fritz,et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[44] Jianxiong Xiao,et al. DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[45] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[46] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[47] Sanja Fidler,et al. MovieQA: Understanding Stories in Movies through Question-Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Kate Saenko,et al. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.
[49] Peng Wang,et al. Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Nassir Navab,et al. Deeper Depth Prediction with Fully Convolutional Residual Networks , 2016, 2016 Fourth International Conference on 3D Vision (3DV).
[53] James J. Little,et al. Play and Learn: Using Video Games to Train Computer Vision Models , 2016, BMVC.
[54] Roberto Cipolla,et al. Understanding Real World Indoor Scenes with Synthetic Data , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Marlos C. Machado,et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning , 2015, AAMAS.
[56] Ali Farhadi,et al. Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Vladlen Koltun,et al. Playing for Data: Ground Truth from Computer Games , 2016, ECCV.
[58] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[59] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[60] Antonio M. López,et al. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Paul Newman,et al. Made to measure: Bespoke landmarks for 24-hour, all-weather localisation with a camera , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[62] Wojciech Jaskowski,et al. ViZDoom: A Doom-based AI research platform for visual reinforcement learning , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).
[63] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[64] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[65] Rob Fergus,et al. Learning Physical Intuition of Block Towers by Example , 2016, ICML.
[66] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[67] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Ali Farhadi,et al. A Diagram is Worth a Dozen Images , 2016, ECCV.
[69] Ali Farhadi,et al. "What Happens If..." Learning to Predict the Effect of Forces in Images , 2016, ECCV.
[70] Richard Socher,et al. Dynamic Memory Networks for Visual and Textual Question Answering , 2016, ICML.
[71] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[72] Trevor Darrell,et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding , 2016, EMNLP.
[73] Kostas Daniilidis,et al. Fast, robust, continuous monocular egomotion computation , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[74] Bohyung Han,et al. MarioQA: Answering Questions by Watching Gameplay Videos , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[75] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[76] Chunhua Shen,et al. Explicit Knowledge-based Reasoning for Visual Question Answering , 2015, IJCAI.
[77] Stephen Clark,et al. Understanding Grounded Language Learning Agents , 2017, ArXiv.
[78] Li Fei-Fei,et al. Inferring and Executing Programs for Visual Reasoning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[79] Honglak Lee,et al. Communicating Hierarchical Neural Controllers for Learning Zero-shot Task Generalization , 2017 .
[80] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[81] Ali Farhadi,et al. Visual Semantic Planning Using Deep Successor Representations , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[82] Anil A. Bharath,et al. Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.
[83] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[84] Byoung-Tak Zhang,et al. DeepStory: Video Story QA by Deep Embedded Memory Networks , 2017, IJCAI.
[85] Qi Wu,et al. Visual question answering: A survey of methods and datasets , 2016, Comput. Vis. Image Underst..
[86] Ali Farhadi,et al. AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.
[87] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[88] Trevor Darrell,et al. Learning to Reason: End-to-End Module Networks for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[89] Yale Song,et al. TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[90] Jonghyun Choi,et al. Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[91] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[92] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[93] Sergey Levine,et al. (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.
[94] R. Sarpong,et al. Bio-inspired synthesis of xishacorenes A, B, and C, and a new congener from fuscol , 2019, Chemical science.
[95] 一樹 美添,et al. Understand It in 5 Minutes!? A Quick Skim of Famous Papers: Silver, D. et al.: Mastering the Game of Go without Human Knowledge , 2018 .
[96] Yoshua Bengio,et al. FigureQA: An Annotated Figure Dataset for Visual Reasoning , 2017, ICLR.
[97] Ruslan Salakhutdinov,et al. Gated-Attention Architectures for Task-Oriented Language Grounding , 2017, AAAI.
[98] Qi Wu,et al. FVQA: Fact-Based Visual Question Answering , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[99] Ali Farhadi,et al. YOLOv3: An Incremental Improvement , 2018, ArXiv.
[100] Joseph Redmon,et al. YOLOv3: An Incremental Improvement , 2018, ArXiv.
[101] Rahul Sukthankar,et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[102] Koren,et al. Real-Time Obstacle Avoidance for Fast Mobile Robots , 2022 .