Hand-Object Interaction Reasoning
暂无分享,去创建一个
Dima Damen | Jian Ma | D. Damen | Jian Ma
[1] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[2] David F. Fouhey,et al. Understanding Human Hands in Contact at Internet Scale , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Xuming He,et al. Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Dima Damen,et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset , 2018, ArXiv.
[7] Mingmin Chi,et al. Relation Parsing Neural Network for Human-Object Interaction Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[8] Chuang Gan,et al. TSM: Temporal Shift Module for Efficient Video Understanding , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[9] Trevor Darrell,et al. Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Christoph Feichtenhofer,et al. X3D: Expanding Architectures for Efficient Video Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[12] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[13] Marie-Francine Moens,et al. Revisiting spatio-temporal layouts for compositional action recognition , 2021, BMVC.
[14] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[15] Jia Deng,et al. Learning to Detect Human-Object Interactions , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[16] Zheng Shou,et al. Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Chen Gao,et al. iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection , 2018, BMVC.
[18] Christian Wolf,et al. Object Level Visual Reasoning in Videos , 2018, ECCV.
[19] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Kaiming He,et al. Detecting and Recognizing Human-Object Interactions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[21] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Andrew Zisserman,et al. Video Action Transformer Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[24] Chao Li,et al. Collaborative Spatiotemporal Feature Learning for Video Action Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[27] Sen Jia,et al. How Much Position Information Do Convolutional Neural Networks Encode? , 2020, ICLR.
[28] Gedas Bertasius,et al. Is Space-Time Attention All You Need for Video Understanding? , 2021, ICML.
[29] D. Damen,et al. Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100 , 2020, International Journal of Computer Vision.
[30] Cewu Lu,et al. Transferable Interactiveness Knowledge for Human-Object Interaction Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Omri Bar,et al. Video Transformer Network , 2021, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).
[33] Kaiming He,et al. Long-Term Feature Banks for Detailed Video Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[35] Amir Roshan Zamir,et al. Action Recognition in Realistic Sports Videos , 2014 .
[36] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[37] Cordelia Schmid,et al. Actor-Centric Relation Network , 2018, ECCV.
[38] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[40] Arman Cohan,et al. Longformer: The Long-Document Transformer , 2020, ArXiv.
[41] Alexander Kolesnikov,et al. MLP-Mixer: An all-MLP Architecture for Vision , 2021, NeurIPS.
[42] Chen Gao,et al. DRG: Dual Relation Graph for Human-Object Interaction Detection , 2020, ECCV.
[43] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.