Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video
James M. Rehg | Yin Li | Miao Liu | Siyu Tang
[1] Dima Damen,et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset , 2018, ArXiv.
[2] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[3] Bingbing Ni,et al. Egocentric Activity Prediction via Event Modulated Attention , 2018, ECCV.
[4] Kris M. Kitani,et al. Going Deeper into First-Person Activity Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] James M. Rehg,et al. Paying More Attention to Motion: Attention Distillation for Learning Video Representations , 2019, ArXiv.
[6] Kristen Grauman,et al. Grounded Human-Object Interaction Hotspots From Video , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Ramakant Nevatia,et al. RED: Reinforced Encoder-Decoder Networks for Action Anticipation , 2017, BMVC.
[8] Otmar Hilliges,et al. Structured Prediction Helps 3D Human Motion Modelling , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Alexei A. Efros,et al. Scene Semantics from Long-Term Observation of People , 2012, ECCV.
[10] Martial Hebert,et al. Activity Forecasting , 2012, ECCV.
[11] Yee Whye Teh,et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.
[12] Nanning Zheng,et al. Inferring Human Attention by Learning Latent Intentions , 2017, IJCAI.
[13] David J. Fleet,et al. Erratum: "Gaussian process dynamical models for human motion" (IEEE Transactions on Pattern Analysis and Machine Intelligence (292)) , 2008 .
[14] James M. Rehg,et al. In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video , 2018, ECCV.
[15] David J. Fleet,et al. Gaussian Process Dynamical Models for Human Motion , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] L. Riggio,et al. The role of attention in the occurrence of the affordance effect. , 2008, Acta psychologica.
[18] G. Rizzolatti,et al. Activation of human primary motor cortex during action observation: a neuromagnetic study. , 1998, Proceedings of the National Academy of Sciences of the United States of America.
[19] Giovanni Maria Farinella,et al. Next-active-object prediction from egocentric videos , 2017, J. Vis. Commun. Image Represent..
[20] Jitendra Malik,et al. What will Happen Next? Forecasting Player Moves in Sports Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Heng Wang,et al. Video Classification With Channel-Separated Convolutional Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[22] Deva Ramanan,et al. Detecting activities of daily living in first-person camera views , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[23] Vladimir Pavlovic,et al. Learning Switching Linear Models of Human Motion , 2000, NIPS.
[24] Silvio Savarese,et al. Demo2Vec: Reasoning Object Affordances from Online Videos , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] W. James,et al. The Principles of Psychology. , 1983 .
[26] Yin Li,et al. In the Eye of the Beholder: Gaze and Actions in First Person Video , 2021, IEEE transactions on pattern analysis and machine intelligence.
[27] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Larry H. Matthies,et al. Pooled motion features for first-person videos , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Martial Hebert,et al. The Pose Knows: Video Forecasting by Generating Pose Futures , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[30] Shmuel Peleg,et al. Compact CNN for indexing egocentric videos , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).
[31] Yun Fu,et al. Human Action Recognition and Prediction: A Survey , 2018, International Journal of Computer Vision.
[32] Yoichi Sato,et al. Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition , 2018, ECCV.
[33] Jianbo Shi,et al. Egocentric Future Localization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Luc Van Gool,et al. What makes a chair a chair? , 2011, CVPR 2011.
[35] Giovanni Maria Farinella,et al. Leveraging Uncertainty to Rethink Loss Functions and Evaluation Measures for Egocentric Action Anticipation , 2018, ECCV Workshops.
[36] C. V. Jawahar,et al. First Person Action Recognition Using Deep Learned Descriptors , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[38] Jitendra Malik,et al. Recurrent Network Models for Human Dynamics , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[39] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[40] James M. Rehg,et al. Delving into egocentric actions , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Jake K. Aggarwal,et al. Robot-Centric Activity Prediction from First-Person Videos: What Will They Do to Me? , 2015, 2015 10th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[43] Zhuowen Tu,et al. Deeply Supervised Salient Object Detection with Short Connections , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Yunde Jia,et al. Parsing video events with goal inference and intent prediction , 2011, 2011 International Conference on Computer Vision.
[45] H. Spencer. The Principles of Psychology - Vol. I , 2016 .
[46] Giovanni Maria Farinella,et al. What Would You Expect? Anticipating Egocentric Actions With Rolling-Unrolling LSTMs and Modality Attention , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[47] G. Rizzolatti,et al. Understanding motor events: a neurophysiological study , 2004, Experimental Brain Research.
[48] M. Rushworth,et al. The left parietal and premotor cortices: motor attention and selection , 2003, NeuroImage.
[49] Bernt Schiele,et al. Time-Conditioned Action Anticipation in One Shot , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] José M. F. Moura,et al. Adversarial Geometry-Aware Human Motion Prediction , 2018, ECCV.
[51] Ali Farhadi,et al. Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off! , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[52] Nicholas Rhinehart,et al. Learning Action Maps of Large Environments via First-Person Vision , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Abhinav Gupta,et al. Binge Watching: Scaling Affordance Learning from Sitcoms , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[55] Masaki Hayashi,et al. Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature , 2016, BMVC.
[56] Petros Daras,et al. Deep Affordance-Grounded Sensorimotor Object Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Yoichi Sato,et al. Future Person Localization in First-Person Videos , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[58] Hema Swetha Koppula,et al. Anticipating Human Activities Using Object Affordances for Reactive Robotic Response , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[59] Bingbing Ni,et al. Cascaded Interactional Targeting Network for Egocentric Video Analysis , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Silvio Savarese,et al. Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Martial Hebert,et al. Cross-Stitch Networks for Multi-task Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] C. Urgesi,et al. Action anticipation and motor resonance in elite basketball players , 2008, Nature Neuroscience.
[63] Kristen Grauman,et al. Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance , 2016, International Journal of Computer Vision.
[64] Ivan Laptev,et al. Leveraging the Present to Anticipate the Future in Videos , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[65] Nicholas Rhinehart,et al. First-Person Activity Forecasting with Online Inverse Reinforcement Learning , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[66] Antonio Torralba,et al. Anticipating Visual Representations from Unlabeled Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[67] David J. Fleet,et al. Topologically-constrained latent variable models , 2008, ICML '08.
[68] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[69] Heng Wang,et al. Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[70] Ali Farhadi,et al. Understanding egocentric activities , 2011, 2011 International Conference on Computer Vision.