[1] Alan L. Yuille. Towards a theory of compositional learning and encoding of objects , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
[2] Fei-Fei Li,et al. Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[3] Tat-Seng Chua,et al. Multiple Hypothesis Video Relation Detection , 2019, 2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM).
[4] Cordelia Schmid,et al. Temporal Localization of Actions with Actoms , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Shiliang Pu,et al. Video Relation Detection with Spatio-Temporal Graph , 2019, ACM Multimedia.
[6] Huchuan Lu,et al. Bi-Directional Relationship Inferring Network for Referring Image Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Cees G. M. Snoek,et al. Actor-Transformers for Group Activity Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[9] Alexander Schwing,et al. Dynamic Neural Relational Inference , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Dietrich Paulus,et al. Simple online and realtime tracking with a deep association metric , 2017, 2017 IEEE International Conference on Image Processing (ICIP).
[11] Jiaxuan Wang,et al. HICO: A Benchmark for Recognizing Human-Object Interactions in Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[12] Jianping Fan,et al. NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification , 2018, ECCV Workshops.
[13] Mohan S. Kankanhalli,et al. Learning to Detect Human-Object Interactions With Knowledge , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[15] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[17] Larry S. Davis,et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Yadong Mu,et al. Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Jos B. T. M. Roerdink,et al. The Watershed Transform: Definitions, Algorithms and Parallelization Strategies , 2000, Fundam. Informaticae.
[20] Tat-Seng Chua,et al. Video Visual Relation Detection , 2017, ACM Multimedia.
[21] Ali Farhadi,et al. Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Zhenzhong Chen,et al. Hierarchical Graph Attention Network for Visual Relationship Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[24] Yu Xiao,et al. Heterogeneous Non-Local Fusion for Multimodal Activity Recognition , 2020, ICMR.
[25] Andrew Zisserman,et al. Video Action Transformer Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ivan Laptev,et al. Learnable pooling with Context Gating for video classification , 2017, ArXiv.
[27] Abhinav Gupta,et al. ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Tat-Seng Chua,et al. Video Relation Detection via Multiple Hypothesis Association , 2020, ACM Multimedia.
[29] Cordelia Schmid,et al. Stable Hyper-pooling and Query Expansion for Event Detection , 2013, 2013 IEEE International Conference on Computer Vision.
[30] Alan Yuille,et al. Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion , 2020, International Journal of Computer Vision.
[31] Huaijiang Sun,et al. Learning Dynamic Relationships for 3D Human Motion Prediction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Marcel Worring,et al. Concept-Based Video Retrieval , 2009, Found. Trends Inf. Retr.
[33] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[34] Ivan Laptev,et al. Learning Interactions and Relationships Between Movie Characters , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Ganesh Ramakrishnan,et al. LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos , 2020, ACM Multimedia.
[36] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[38] Heyan Huang,et al. 3-D Relation Network for visual relation recognition in videos , 2021, Neurocomputing.
[39] Shizhe Chen,et al. Relation Understanding in Videos , 2019, ACM Multimedia.
[40] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[41] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Xuming He,et al. Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[44] Fahad Shahbaz Khan,et al. Learning Human-Object Interaction Detection Using Interaction Points , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Gangshan Wu,et al. Video Visual Relation Detection via Multi-modal Feature Fusion , 2019, ACM Multimedia.
[46] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[47] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[48] Guanghui Ren,et al. Video Relation Detection with Trajectory-aware Multi-modal Features , 2020, ACM Multimedia.
[49] Cewu Lu,et al. Transferable Interactiveness Knowledge for Human-Object Interaction Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Fei Wang,et al. PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Yu Cao,et al. Annotating Objects and Relations in User-Generated Videos , 2019, ICMR.
[52] Jia Deng,et al. Learning to Detect Human-Object Interactions , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[53] C. V. Jawahar,et al. Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[54] Tatsuya Harada,et al. Bounding-Box Channels for Visual Relationship Detection , 2020, ECCV.
[55] Tao Hu,et al. Interactivity Proposals for Surveillance Videos , 2020, ICMR.
[56] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.