Grounded Human-Object Interaction Hotspots From Video
Tushar Nagarajan | Christoph Feichtenhofer | Kristen Grauman
[1] Yoichi Sato, et al. Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition, 2018, ECCV.
[2] Nikolaos G. Tsagarakis, et al. Object-based affordances detection with Convolutional Neural Networks and dense Conditional Random Fields, 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Ruben Villegas, et al. Learning to Generate Long-term Future via Hierarchical Prediction, 2017, ICML.
[4] Deva Ramanan, et al. Detecting activities of daily living in first-person camera views, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[5] Hema Swetha Koppula, et al. Anticipating Human Activities Using Object Affordances for Reactive Robotic Response, 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Kris M. Kitani, et al. Action-Reaction: Forecasting the Dynamics of Human Interaction, 2014, ECCV.
[7] Antonio Torralba, et al. Generating the Future with Adversarial Transformers, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Fei-Fei Li, et al. Grouplet: A structured image representation for recognizing human and object interactions, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[9] Alexei A. Efros, et al. People Watching: Human Actions as a Cue for Single View Geometry, 2012, International Journal of Computer Vision.
[10] Giovanni Maria Farinella, et al. Next-active-object prediction from egocentric videos, 2017, J. Vis. Commun. Image Represent..
[11] Larry S. Davis, et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Hema Swetha Koppula, et al. Physically Grounded Spatio-temporal Object Affordances, 2014, ECCV.
[13] Barbara Caputo, et al. Using Object Affordances to Improve Object Recognition, 2011, IEEE Transactions on Autonomous Mental Development.
[14] Luc Van Gool, et al. What makes a chair a chair?, 2011, CVPR 2011.
[15] Karl J. Friston, et al. Evidence of Mirror Neurons in Human Inferior Frontal Gyrus, 2009, The Journal of Neuroscience.
[16] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[17] Yiannis Aloimonos, et al. Affordance detection of tool parts from geometric features, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[18] Juergen Gall, et al. Adaptive Binarization for Weakly Supervised Affordance Segmentation, 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[19] Frédo Durand, et al. What Do Different Evaluation Metrics Tell Us About Saliency Models?, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Rita Cucchiara, et al. A deep multi-level network for saliency prediction, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[21] Alexei A. Efros, et al. From 3D scene geometry to human workspace, 2011, CVPR 2011.
[22] Nitish Srivastava, et al. Unsupervised Learning of Video Representations using LSTMs, 2015, ICML.
[23] Noel E. O'Connor, et al. SalGAN: Visual Saliency Prediction with Generative Adversarial Networks, 2017, ArXiv.
[24] Chenfanfu Jiang, et al. Inferring Forces and Learning Human Utilities from Videos, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Pat Hanrahan, et al. SceneGrok: inferring action maps in 3D environments, 2014, ACM Trans. Graph..
[26] James M. Rehg, et al. Affordance Prediction via Learned Object Attributes, 2011.
[27] Sergey Levine, et al. Deep visual foresight for planning robot motion, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[28] Dima Damen, et al. You-Do, I-Learn: Discovering Task Relevant Objects and their Modes of Interaction from Multi-User Egocentric Video, 2014, BMVC.
[29] Dima Damen, et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset, 2018, ArXiv.
[30] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[31] Salman Khan, et al. Visual Affordance and Function Understanding, 2018, ACM Comput. Surv..
[32] Alexei A. Efros, et al. Scene Semantics from Long-Term Observation of People, 2012, ECCV.
[33] Danica Kragic, et al. Visual object-action recognition: Inferring object affordances from human demonstration, 2011, Comput. Vis. Image Underst..
[34] Nicholas Rhinehart, et al. First-Person Activity Forecasting from Video with Online Inverse Reinforcement Learning, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Larry S. Davis, et al. Objects in Action: An Approach for Combining Action Understanding and Object Perception, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[36] Antonio Torralba, et al. Generating Videos with Scene Dynamics, 2016, NIPS.
[37] Thomas Brox, et al. Striving for Simplicity: The All Convolutional Net, 2014, ICLR.
[38] Eric P. Xing, et al. Dual Motion GAN for Future-Flow Embedded Video Prediction, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[39] Juergen Gall, et al. Weakly Supervised Affordance Detection, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Sanja Fidler, et al. Learning to Act Properly: Predicting and Explaining Affordances from Images, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Sergey Levine, et al. Unsupervised Learning for Physical Interaction through Video Prediction, 2016, NIPS.
[42] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[43] Petros Daras, et al. Deep Affordance-Grounded Sensorimotor Object Recognition, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Matthias Bethge, et al. DeepGaze II: Reading fixations from deep features trained on object recognition, 2016, ArXiv.
[45] James J. Gibson. The Ecological Approach to Visual Perception: Classic Edition, 2014.
[46] Sergey Levine, et al. Self-Supervised Visual Planning with Temporal Skip Connections, 2017, CoRL.
[47] Song-Chun Zhu, et al. Understanding tools: Task-oriented object modeling, learning and recognition, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Martial Hebert, et al. The Pose Knows: Video Forecasting by Generating Pose Futures, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[50] Kristen Grauman, et al. Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance, 2016, International Journal of Computer Vision.
[51] Sinisa Todorovic, et al. A Multi-scale CNN for Affordance Segmentation in RGB Images, 2016, ECCV.
[52] Vladlen Koltun, et al. Multi-Scale Context Aggregation by Dilated Convolutions, 2015, ICLR.
[53] Fei-Fei Li, et al. Discovering Object Functionality, 2013, 2013 IEEE International Conference on Computer Vision.
[54] Nikolaos G. Tsagarakis, et al. Detecting object affordances with Convolutional Neural Networks, 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[55] Ivan Laptev, et al. Joint Discovery of Object States and Manipulation Actions, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[56] Bolei Zhou, et al. Learning Deep Features for Discriminative Localization, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Darwin G. Caldwell, et al. AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[58] Alexei A. Efros, et al. Time-Agnostic Prediction: Predicting Predictable Video Frames, 2018, ICLR.
[59] Silvio Savarese, et al. Demo2Vec: Reasoning Object Affordances from Online Videos, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[60] Marc'Aurelio Ranzato, et al. Video (language) modeling: a baseline for generative models of natural videos, 2014, ArXiv.
[61] Nicholas Rhinehart, et al. Learning Action Maps of Large Environments via First-Person Vision, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Jiajun Wu, et al. Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks, 2016, NIPS.
[63] Hema Swetha Koppula, et al. Learning human activities and object affordances from RGB-D videos, 2012, Int. J. Robotics Res..
[64] Bingbing Ni, et al. Cascaded Interactional Targeting Network for Egocentric Video Analysis, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Antonio Torralba, et al. Anticipating the future by watching unlabeled video, 2015, ArXiv.
[66] Yann LeCun, et al. Deep multi-scale video prediction beyond mean square error, 2015, ICLR.
[67] Abhinav Gupta, et al. Binge Watching: Scaling Affordance Learning from Sitcoms, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Andrea Vedaldi, et al. AnchorNet: A Weakly Supervised Network to Learn Geometry-Sensitive Features for Semantic Matching, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Bernt Schiele, et al. Functional Object Class Detection Based on Learned Affordance Cues, 2008, ICVS.
[70] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.