Attention over Learned Object Embeddings Enables Complex Visual Reasoning
[1] Jun Liu, et al. SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Song-Chun Zhu, et al. ACRE: Abstract Causal REasoning Beyond Covariation, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Asim Kadav, et al. Hopper: Multi-hop Transformer for Spatiotemporal Reasoning, 2021, ICLR.
[4] Nan Rosemary Ke, et al. Coordination Among Neural Modules Through a Shared Global Workspace, 2021, ICLR.
[5] Jiajun Wu, et al. Unsupervised Discovery of 3D Physical Objects from Video, 2020, ICLR.
[6] Thomas Kipf, et al. Object-Centric Learning with Slot Attention, 2020, NeurIPS.
[7] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[8] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[9] A. Globerson, et al. Learning Object Permanence from Video, 2020, ECCV.
[10] Markus N. Rabe, et al. Transformers Generalize to the Semantics of Logics, 2020.
[11] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[12] Sungjin Ahn, et al. SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition, 2020, ICLR.
[13] Guillaume Lample, et al. Deep Learning for Symbolic Mathematics, 2019, ICLR.
[14] D. Ramanan, et al. CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning, 2019, ICLR.
[15] J. Tenenbaum, et al. CLEVRER: CoLlision Events for Video REpresentation and Reasoning, 2019, ICLR.
[16] Murray Shanahan, et al. Reconciling deep learning with symbolic artificial intelligence: representing objects and relations, 2019, Current Opinion in Behavioral Sciences.
[17] Andrew Zisserman, et al. Video Representation Learning by Dense Predictive Coding, 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[18] Furu Wei, et al. VL-BERT: Pre-training of Generic Visual-Linguistic Representations, 2019, ICLR.
[19] Mohit Bansal, et al. LXMERT: Learning Cross-Modality Encoder Representations from Transformers, 2019, EMNLP.
[20] Cho-Jui Hsieh, et al. VisualBERT: A Simple and Performant Baseline for Vision and Language, 2019, ArXiv.
[21] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[22] Aaron van den Oord, et al. Shaping Belief States with Generative Environment Models for RL, 2019, NeurIPS.
[23] Cordelia Schmid, et al. Contrastive Bidirectional Transformer for Temporal Representation Learning, 2019, ArXiv.
[24] Cordelia Schmid, et al. VideoBERT: A Joint Model for Video and Language Representation Learning, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] James Demmel, et al. Large Batch Optimization for Deep Learning: Training BERT in 76 minutes, 2019, ICLR.
[26] Klaus Greff, et al. Multi-Object Representation Learning with Iterative Variational Inference, 2019, ICML.
[27] Matthew Botvinick, et al. MONet: Unsupervised Scene Decomposition and Representation, 2019, ArXiv.
[28] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[29] Chuang Gan, et al. Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding, 2018, NeurIPS.
[30] Razvan Pascanu, et al. Deep reinforcement learning with relational inductive biases, 2018, ICLR.
[31] Christopher D. Manning, et al. Compositional Attention Networks for Machine Reasoning, 2018, ICLR.
[32] Tomasz Kornuta, et al. Object-Based Reasoning in VQA, 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[33] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[34] Razvan Pascanu, et al. Discovering objects and their relations from entangled scene representations, 2017, ICLR.
[35] Li Fei-Fei, et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[38] Zhe Chen. Object-based attention: A tutorial review, 2012, Attention, Perception, & Psychophysics.
[39] David M. Sobel, et al. Detecting blickets: how young children use information about novel causal powers in categorization and induction, 2000, Child development.
[40] Pieter R. Roelfsema, et al. Object-based attention in the primary visual cortex of the macaque monkey, 1998, Nature.
[41] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[42] Katherine D. Kinzler, et al. Core knowledge, 2007, Developmental science.