Joshua B. Tenenbaum | Ping Luo | Mingyu Ding | Chuang Gan | Tao Du | Zhenfang Chen
[1] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[2] Razvan Pascanu,et al. Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.
[3] Trevor Darrell,et al. Explainable Neural Computation via Stack Neural Module Networks , 2018, ECCV.
[4] Marc Toussaint,et al. Differentiable Physics and Stable Modes for Tool-Use and Manipulation Planning , 2018, Robotics: Science and Systems.
[5] Chuang Gan,et al. CLEVRER: CoLlision Events for Video REpresentation and Reasoning , 2020, ICLR.
[6] Joshua B. Tenenbaum,et al. A Compositional Object-Based Approach to Learning Physical Dynamics , 2016, ICLR.
[7] Li Fei-Fei,et al. Learning Physical Graph Representations from Visual Scenes , 2020, NeurIPS.
[8] Sergey Levine,et al. Reasoning About Physical Interactions with Object-Oriented Prediction and Planning , 2018, ICLR.
[9] Ross B. Girshick,et al. PHYRE: A New Benchmark for Physical Reasoning , 2019, NeurIPS.
[10] Chen Sun,et al. VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[11] Long Chen,et al. Video Question Answering via Attribute-Augmented Attention Network Learning , 2017, SIGIR.
[12] Chuang Gan,et al. Visual Concept-Metaconcept Learning , 2020, NeurIPS.
[13] Martial Hebert,et al. Learning by Asking Questions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[14] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[15] Sergey Levine,et al. Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.
[16] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Jiajun Wu,et al. Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.
[18] Deva Ramanan,et al. CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning , 2020, ICLR.
[19] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[20] Elise van der Pol,et al. Contrastive Learning of Structured World Models , 2020, ICLR.
[21] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Oleksandr Polozov,et al. Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" , 2020, ICML.
[23] Jessica B. Hamrick,et al. Simulation as an engine of physical scene understanding , 2013, Proceedings of the National Academy of Sciences.
[24] Doug L. James,et al. Real time physics: class notes , 2008, SIGGRAPH '08.
[25] John C. Butcher,et al. A stability property of implicit Runge-Kutta methods , 1975.
[26] Joshua B. Tenenbaum,et al. The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark Towards Physically Realistic Embodied AI , 2021, 2022 International Conference on Robotics and Automation (ICRA).
[27] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[28] Jitendra Malik,et al. Finding action tubes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Louis-Philippe Morency,et al. Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Jiajun Wu,et al. Learning to See Physics via Visual De-animation , 2017, NIPS.
[31] Gaurav S. Sukhatme,et al. Interactive Differentiable Simulation , 2019, ArXiv.
[32] Chunhua Shen,et al. What Value Do Explicit High Level Concepts Have in Vision to Language Problems? , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[34] Abhinav Gupta,et al. Compositional Video Prediction , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[35] Michael S. Bernstein,et al. Visual7W: Grounded Question Answering in Images , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Runhao Zeng,et al. Location-Aware Graph Convolutional Networks for Video Question Answering , 2020, AAAI.
[37] Jiajun Wu,et al. Entity Abstraction in Visual Model-Based Reinforcement Learning , 2019, CoRL.
[38] Christopher D. Manning,et al. GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Li Fei-Fei,et al. Inferring and Executing Programs for Visual Reasoning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[40] Jiajun Wu,et al. Learning Compositional Koopman Operators for Model-Based Control , 2020, ICLR.
[41] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[43] Yueting Zhuang,et al. Video Question Answering via Gradually Refined Attention over Appearance and Motion , 2017, ACM Multimedia.
[44] Asim Kadav,et al. Hopper: Multi-hop Transformer for Spatiotemporal Reasoning , 2021, ICLR.
[45] Jiajun Wu,et al. Combining Physical Simulators and Object-Based Networks for Control , 2019, 2019 International Conference on Robotics and Automation (ICRA).
[46] Frédo Durand,et al. DiffTaichi: Differentiable Programming for Physical Simulation , 2020, ICLR.
[47] Felix Hill,et al. Object-based attention for spatio-temporal reasoning: Outperforming neuro-symbolic models with flexible distributed architectures , 2020, ArXiv.
[48] Razvan Pascanu,et al. Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.
[49] Jiajun Wu,et al. Propagation Networks for Model-Based Control Under Partial Observation , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[50] Chuang Gan,et al. Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding , 2018, NeurIPS.
[51] Joshua B. Tenenbaum,et al. PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics , 2021, ICLR.
[52] Abhinav Gupta,et al. Interpretable Intuitive Physics Model , 2018, ECCV.
[53] Jonas Degrave,et al. A Differentiable Physics Engine for Deep Learning in Robotics , 2016, Front. Neurorobot.
[54] Deepak Pathak,et al. Learning Long-term Visual Dynamics with Region Proposal Interaction Networks , 2021, ICLR.
[55] Shunyu Yao,et al. Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations , 2019, NeurIPS.
[56] Qingming Huang,et al. Interpretable Visual Reasoning via Probabilistic Formulation Under Natural Supervision , 2020, ECCV.
[57] Georg Heigold,et al. Object-Centric Learning with Slot Attention , 2020, NeurIPS.
[58] Licheng Yu,et al. TVQA: Localized, Compositional Video Question Answering , 2018, EMNLP.
[59] Sanja Fidler,et al. gradSim: Differentiable simulation for system identification and visuomotor control , 2021, ICLR.
[60] Chuang Gan,et al. Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering , 2019, AAAI.
[61] Daniel L. K. Yamins,et al. Visual Grounding of Learned Physical Models , 2020, ICML.
[62] Christian Wolf,et al. COPHY: Counterfactual Learning of Physical Dynamics , 2020, ICLR.
[63] Emmanuel Dupoux,et al. IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning , 2018, ArXiv.
[64] Ming Lin,et al. Differentiable Physics Simulation , 2020, ICLR.
[65] Max Welling,et al. Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.
[66] Yale Song,et al. TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[67] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[68] Joshua B. Tenenbaum,et al. Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning , 2021, ICLR.
[69] Joshua B. Tenenbaum,et al. End-to-End Differentiable Physics for Learning and Control , 2018, NeurIPS.
[70] Gaurav S. Sukhatme,et al. NeuralSim: Augmenting Differentiable Simulators with Neural Networks , 2021, 2021 IEEE International Conference on Robotics and Automation (ICRA).
[71] Jorge Nocedal,et al. On the limited memory BFGS method for large scale optimization , 1989, Math. Program.
[72] Ali Farhadi,et al. "What Happens If..." Learning to Predict the Effect of Forces in Images , 2016, ECCV.
[73] Shu Zhang,et al. Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[74] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[75] Rob Fergus,et al. Learning Physical Intuition of Block Towers by Example , 2016, ICML.
[76] Chuang Gan,et al. The Neuro-Symbolic Concept Learner: Interpreting Scenes Words and Sentences from Natural Supervision , 2019, ICLR.
[77] Truyen Tran,et al. Hierarchical Conditional Relation Networks for Video Question Answering , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Jitendra Malik,et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.
[79] Sanja Fidler,et al. MovieQA: Understanding Stories in Movies through Question-Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[80] Chuang Gan,et al. ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation , 2020, ArXiv.
[81] Christopher D. Manning,et al. Compositional Attention Networks for Machine Reasoning , 2018, ICLR.
[82] Jiajun Wu,et al. Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids , 2018, ICLR.
[83] David Mascharka,et al. Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.