Binge Watching: Scaling Affordance Learning from Sitcoms
暂无分享,去创建一个
[1] Fei-Fei Li,et al. Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[2] Hema Swetha Koppula,et al. Anticipating Human Activities Using Object Affordances for Reactive Robotic Response , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Liang Lin,et al. Unconstrained Facial Landmark Localization with Backbone-Branches Fully-Convolutional Networks , 2015, ArXiv.
[4] Antonio Torralba,et al. Recognizing indoor scenes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[5] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Song-Chun Zhu,et al. Scene Parsing by Integrating Function, Geometry and Appearance Models , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[7] L. Stark,et al. Dissertation Abstract , 1994, Journal of Cognitive Education and Psychology.
[8] Derek Hoiem,et al. Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.
[9] Abhinav Gupta,et al. Marr Revisited: 2D-3D Alignment via Surface Normal Prediction , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[11] Abhinav Gupta,et al. The Curious Robot: Learning Visual Representations via Physical Interactions , 2016, ECCV.
[12] Yi Li,et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.
[13] Bolei Zhou,et al. Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.
[14] Alexei A. Efros,et al. People Watching: Human Actions as a Cue for Single View Geometry , 2012, International Journal of Computer Vision.
[15] Tsuhan Chen,et al. Understanding images of groups of people , 2009, CVPR.
[16] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.
[17] Jianxiong Xiao,et al. A Linear Approach to Matching Cuboids in RGBD Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[18] Jitendra Malik,et al. Human Pose Estimation with Iterative Error Feedback , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Jitendra Malik,et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.
[20] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[21] Song-Chun Zhu,et al. Inferring "Dark Matter" and "Dark Energy" from Videos , 2013, 2013 IEEE International Conference on Computer Vision.
[22] J. Gibson. The Ecological Approach to Visual Perception , 1979 .
[23] Ali Farhadi,et al. "What Happens If..." Learning to Predict the Effect of Forces in Images , 2016, ECCV.
[24] Michael R. Lowry,et al. Learning Physical Descriptions From Functional Definitions, Examples, and Precedents , 1983, AAAI.
[25] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Jia Deng,et al. Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.
[27] Luc Van Gool,et al. What makes a chair a chair? , 2011, CVPR 2011.
[28] Song-Chun Zhu,et al. Understanding tools: Task-oriented object modeling, learning and recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Azriel Rosenfeld,et al. Recognition by Functional Parts , 1995, Comput. Vis. Image Underst..
[30] Abhinav Gupta,et al. Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[31] Jianxiong Xiao,et al. SUN RGB-D: A RGB-D scene understanding benchmark suite , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Abhinav Gupta,et al. In Defense of the Direct Perception of Affordances , 2015, ArXiv.
[33] Alexei A. Efros,et al. Scene Semantics from Long-Term Observation of People , 2012, ECCV.
[34] Rob Fergus,et al. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[35] Larry S. Davis,et al. Objects in Action: An Approach for Combining Action Understanding and Object Perception , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[36] Abhinav Gupta,et al. Learning a Predictable and Generative Vector Representation for Objects , 2016, ECCV.
[37] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[38] Alexei A. Efros,et al. From 3D scene geometry to human workspace , 2011, CVPR 2011.
[39] Yuandong Tian,et al. Single Image 3D Interpreter Network , 2016, ECCV.
[40] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[41] Li Fei-Fei,et al. Reasoning about Object Affordances in a Knowledge Base Representation , 2014, ECCV.
[42] Danica Kragic,et al. Simultaneous Visual Recognition of Manipulation Actions and Manipulated Objects , 2008, ECCV.
[43] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.