Devi Parikh | Deshraj Yadav | Prithvijit Chattopadhyay | Viraj Prabhu | Arjun Chandrasekaran
[1] Ali Farhadi,et al. Predicting Failures of Vision Systems , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Peter Robinson,et al. Mind reading machines: automated inference of cognitive mental states from video , 2004, 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583).
[3] Dan Klein,et al. Reasoning about Pragmatics with Neural Listeners and Speakers , 2016, EMNLP.
[4] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[5] S. Doyle-Lindrud,et al. Watson will see you now: a supercomputer to help clinicians make informed treatment decisions. , 2015, Clinical journal of oncology nursing.
[6] Antonio Torralba,et al. Who is Mistaken? , 2016, ArXiv.
[7] Bernard Ghanem,et al. What Makes an Object Memorable? , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] S. Baron-Cohen,et al. The "Reading the Mind in the Eyes" Test revised version: a study with normal adults, and adults with Asperger syndrome or high-functioning autism. , 2001, Journal of child psychology and psychiatry, and allied disciplines.
[9] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[10] Mario Fritz,et al. A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input , 2014, NIPS.
[11] Ramprasaath R. Selvaraju,et al. Counting Everyday Objects in Everyday Scenes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] D. Meyer,et al. Evidence for a Collective Intelligence Factor in the Performance of Human Groups , 2022 .
[13] G. Reeke. The society of mind , 1991 .
[14] Rebecca Q. Stafford,et al. Robots with Display Screens: A Robot with a More Humanlike Face Display Is Perceived To Have More Mind and a Better Personality , 2013, PloS one.
[15] Antonio Torralba,et al. Anticipating Visual Representations from Unlabeled Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Yash Goyal,et al. Towards Transparent AI Systems: Interpreting Visual Question Answering Models , 2016, 1608.08974.
[18] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Bernd Brügge,et al. Visual storytelling , 2015, EuroPLoP '13.
[20] Karl Stratos,et al. Understanding and predicting importance in images , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Alex Pentland,et al. A Bayesian Computer Vision System for Modeling Human Interactions , 1999, IEEE Trans. Pattern Anal. Mach. Intell..
[22] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[24] Koray Kavukcuoglu,et al. Multiple Object Recognition with Visual Attention , 2014, ICLR.
[25] Yash Goyal,et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Jason P. Mitchell. Inferences about mental states , 2009, Philosophical Transactions of the Royal Society B: Biological Sciences.
[27] Christopher D. Manning,et al. Learning Language Games through Interaction , 2016, ACL.
[28] Meredith Ringel Morris,et al. Understanding Blind People's Experiences with Computer-Generated Captions of Social Media Images , 2017, CHI.
[29] Peter Robinson,et al. Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.
[30] Ali Farhadi,et al. Towards Transparent Systems: Semantic Characterization of Failure Modes , 2014, ECCV.
[31] Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[32] Trevor Darrell,et al. Generating Visual Explanations , 2016, ECCV.
[33] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[34] Dhruv Batra,et al. Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? , 2016, EMNLP.
[35] C. Chabris,et al. Reading the Mind in the Eyes or Reading between the Lines? Theory of Mind Predicts Collective Intelligence Equally Well Online and Face-To-Face , 2014, PloS one.
[36] Gordon Christie,et al. Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions , 2016, EMNLP.
[37] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[38] Antonio Torralba,et al. Where are they looking? , 2015, NIPS.
[39] Brian Scassellati,et al. Theory of Mind for a Humanoid Robot , 2002, Auton. Robots.
[40] Kewei Tu,et al. Joint Video and Text Parsing for Understanding Events and Answering Queries , 2013, IEEE MultiMedia.
[41] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[42] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[43] Carlos Guestrin,et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.
[44] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[45] M. Tomasello,et al. Great apes anticipate that other individuals will act according to false beliefs , 2016, Science.
[46] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[47] Marvin Minsky,et al. Society of Mind: A Response to Four Reviews , 1991, Artif. Intell..
[48] Trevor Darrell,et al. Attentive Explanations: Justifying Decisions and Pointing to the Evidence , 2016, ArXiv.
[49] Kate Saenko,et al. Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.
[50] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Rob Miller,et al. VizWiz: nearly real-time answers to visual questions , 2010, UIST.
[52] Dhruv Batra,et al. Sort Story: Sorting Jumbled Images and Captions into Stories , 2016, EMNLP.
[53] Shaomei Wu,et al. Automatic Alt-text: Computer-generated Image Descriptions for Blind Users on a Social Network Service , 2017, CSCW.
[54] Barbara J. Grosz,et al. What Question Would Turing Pose Today? , 2012, AI Mag..
[55] Vicente Ordonez,et al. ReferItGame: Referring to Objects in Photographs of Natural Scenes , 2014, EMNLP.
[56] H. Wimmer,et al. Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception , 1983, Cognition.
[57] Antonio Torralba,et al. Predicting Motivations of Actions by Leveraging Text , 2014, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Bilge Mutlu,et al. A Storytelling Robot: Modeling and Evaluation of Human-like Gaze Behavior , 2006, 2006 6th IEEE-RAS International Conference on Humanoid Robots.
[59] Bolei Zhou,et al. Object Detectors Emerge in Deep Scene CNNs , 2014, ICLR.
[60] David A. Ferrucci,et al. Introduction to "This is Watson" , 2012, IBM J. Res. Dev..
[61] M. Tomasello,et al. Does the chimpanzee have a theory of mind? 30 years later , 2008, Trends in Cognitive Sciences.
[62] Ramprasaath R. Selvaraju,et al. Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization , 2016 .
[63] Yoshua Bengio,et al. Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks , 2015, IEEE Transactions on Multimedia.
[64] Ashwin K. Vijayakumar,et al. We are Humor Beings: Understanding and Predicting Visual Humor , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Wei Xu,et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question , 2015, NIPS.
[66] 付伶俐 (Fu Lingli). 打磨Using Language,倡导新理念 [Polishing "Using Language", Advocating New Ideas] , 2014 .
[67] Dhruv Batra,et al. Analyzing the Behavior of Visual Question Answering Models , 2016, EMNLP.
[68] Hye-Won Song,et al. A Review of Computer Vision Methods for Purpose on Computer-Aided Diagnosis , 2016 .
[69] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.
[70] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[71] Mathias Broth,et al. Why That Nao?: How Humans Adapt to a Conventional Humanoid Robot in Taking Turns-at-Talk , 2016, CHI.
[72] Kristen Grauman,et al. Implied Feedback: Learning Nuances of User Behavior in Image Search , 2013, 2013 IEEE International Conference on Computer Vision.
[73] Susan R. Fussell,et al. How people anthropomorphize robots , 2008, 2008 3rd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[74] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[75] Erik T. Mueller,et al. Watson: Beyond Jeopardy! , 2013, Artif. Intell..
[76] S. Baron-Cohen. The evolution of a theory of mind. , 1999 .