Talk2Nav: Long-Range Vision-and-Language Navigation with Dual Attention and Spatial Memory
[1] Trevor Darrell,et al. Explainable Neural Computation via Stack Neural Module Networks , 2018, ECCV.
[2] Stephan Winter,et al. Structural Salience of Landmarks for Route Directions , 2005, COSIT.
[3] Jitendra Malik,et al. Visual Memory for Robust Path Following , 2018, NeurIPS.
[4] Andreas Geiger,et al. SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images , 2018, ECCV.
[5] Yoav Artzi,et al. TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[7] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[8] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[9] Stephen Clark,et al. Understanding Grounded Language Learning Agents , 2017, ArXiv.
[10] M. Denis,et al. Language and spatial cognition: comparing the roles of landmarks and street names in route instructions , 2004 .
[11] Luc Van Gool,et al. Learning Accurate, Comfortable and Human-like Driving , 2019, ArXiv.
[12] Sanja Fidler,et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[13] Paul U. Lee,et al. Wayfinding choremes - a language for modeling conceptual route knowledge , 2005, J. Vis. Lang. Comput..
[14] Khanh Nguyen,et al. Vision-Based Navigation With Language-Based Assistance via Imitation Learning With Indirect Intervention , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Ilya Kostrikov,et al. PlaNet - Photo Geolocation with Convolutional Neural Networks , 2016, ECCV.
[16] Luc Van Gool,et al. Object Referring in Videos with Language and Human Gaze , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[17] Luc Van Gool,et al. Object Referring in Visual Scene with Spoken Language , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[18] John F. Canny,et al. Grounding Human-To-Vehicle Advice for Self-Driving Vehicles , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Raia Hadsell,et al. Learning To Follow Directions in Street View , 2019, AAAI.
[20] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[21] Ghassan Al-Regib,et al. The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Gregory Shakhnarovich,et al. Discriminability Objective for Training Descriptive Captions , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[23] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Qi Wu,et al. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Raymond J. Mooney,et al. Learning to Interpret Natural Language Navigation Instructions from Observations , 2011, Proceedings of the AAAI Conference on Artificial Intelligence.
[26] Luc Van Gool,et al. Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Emily M. Bender. On Achieving and Evaluating Language-Independence in NLP , 2022, Linguistic Issues in Language Technology (LiLT).
[28] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[29] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[30] Xiaogang Wang,et al. Residual Attention Network for Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[32] Ali Farhadi,et al. Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[34] Samarth Brahmbhatt,et al. DeepNav: Learning to Navigate Large Cities , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Ruslan Salakhutdinov,et al. Generating Images from Captions with Attention , 2015, ICLR.
[36] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[37] Yuan-Fang Wang,et al. Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Alexander G. Schwing,et al. Convolutional Image Captioning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Paul U. Lee,et al. Pictorial and Verbal Tools for Conveying Routes , 1999, COSIT.
[40] Toru Ishikawa,et al. Landmark Selection in the Environment: Relationships with Object Characteristics and Sense of Direction , 2012, Spatial Cogn. Comput..
[41] Stefan Lee,et al. Embodied Question Answering , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Luc Van Gool,et al. End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners , 2018, ECCV.
[44] Alexandra Millonig,et al. Developing Landmark-Based Pedestrian-Navigation Systems , 2007, IEEE Transactions on Intelligent Transportation Systems.
[45] Demis Hassabis,et al. Grounded Language Learning in a Simulated 3D World , 2017, ArXiv.
[46] Ali Farhadi,et al. IQA: Visual Question Answering in Interactive Environments , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[47] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[49] Raia Hadsell,et al. Learning to Navigate in Cities Without a Map , 2018, NeurIPS.
[50] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[51] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[52] Marie-Francine Moens,et al. Talk2Car: Taking Control of Your Self-Driving Car , 2019, EMNLP.
[53] Michel Denis,et al. Referring to Landmark or Street Information in Route Directions: What Difference Does It Make? , 2003, COSIT.
[54] T. Tenbrink,et al. Would you follow your own route description? Cognitive strategies in urban route planning , 2011, Cognition.
[55] Daniel Jurafsky,et al. Learning to Follow Navigational Directions , 2010, ACL.
[56] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[57] Byoungkwon An,et al. Looking Beyond the Visible Scene , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[58] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[59] Xin Wang,et al. Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation , 2018, ECCV.
[60] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[61] Emily M. Bender,et al. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science , 2018, TACL.
[62] Siddhartha S. Srinivasa,et al. Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[64] Maneesh Agrawala,et al. Automatic generation of tourist maps , 2008, ACM Trans. Graph..
[65] Rahul Sukthankar,et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[66] Silvio Savarese,et al. Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation , 2018, EMNLP.
[67] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[68] Dan Klein,et al. Speaker-Follower Models for Vision-and-Language Navigation , 2018, NeurIPS.
[69] Luc Van Gool,et al. Navigation using special buildings as signposts , 2014, MapInteract '14.
[70] Yann LeCun,et al. Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[71] Michel Denis,et al. When and Why Are Visual Landmarks Used in Giving Directions? , 2001, COSIT.
[72] Lixiang Li,et al. Captioning Transformer with Stacked Attention Modules , 2018 .
[73] Jitendra Malik,et al. On Evaluation of Embodied Navigation Agents , 2018, ArXiv.
[74] Trevor Darrell,et al. Localizing Moments in Video with Natural Language , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[75] Yale Song,et al. Video2GIF: Automatic Generation of Animated GIFs from Video , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[76] Alex Graves,et al. Neural Turing Machines , 2014, ArXiv.
[77] Jean Oh,et al. Grounding spatial relations for outdoor robot navigation , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).