3-D Relation Network for visual relation recognition in videos
暂无分享,去创建一个
Heyan Huang | Xindi Shang | Tat-Seng Chua | Boran Wang | Qianwen Cao | Tat-Seng Chua | Heyan Huang | Boran Wang | Xindi Shang | Qianwen Cao
[1] Shu Zhang,et al. Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Bin Zhao,et al. HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[3] Christian Ledig,et al. Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Max Welling,et al. Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.
[5] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[6] Cordelia Schmid,et al. Actor-Centric Relation Network , 2018, ECCV.
[7] David A. Shamma,et al. YFCC100M , 2015, Commun. ACM.
[8] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[9] Michael S. Bernstein,et al. Visual Relationship Detection with Language Priors , 2016, ECCV.
[10] Ramakant Nevatia,et al. Motion-Appearance Co-memory Networks for Video Question Answering , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Rahul Sukthankar,et al. Rethinking the Faster R-CNN Architecture for Temporal Action Localization , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[13] Shuicheng Yan,et al. Seq-NMS for Video Object Detection , 2016, ArXiv.
[14] Zhou Zhao,et al. Multi-interaction Network with Object Relation for Video Question Answering , 2019, ACM Multimedia.
[15] Klaus Schöffmann,et al. Collaborative Feature Maps for Interactive Video Search , 2017, MMM.
[16] Nicu Sebe,et al. Quantization-based hashing: a general framework for scalable image and video retrieval , 2018, Pattern Recognit..
[17] Tat-Seng Chua,et al. Video Visual Relation Detection , 2017, ACM Multimedia.
[18] Ali Farhadi,et al. Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Yu Cao,et al. Annotating Objects and Relations in User-Generated Videos , 2019, ICMR.
[20] Yujie Wang,et al. Flow-Guided Feature Aggregation for Video Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Yuxin Peng,et al. Hierarchical Vision-Language Alignment for Video Captioning , 2018, MMM.
[23] Jun Yu,et al. On Exploring Undetermined Relationships for Visual Relationship Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Bodo Rosenhahn,et al. Natural Language Guided Visual Relationship Detection , 2017, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[25] Gangshan Wu,et al. Video Visual Relation Detection via Multi-modal Feature Fusion , 2019, ACM Multimedia.
[26] Cordelia Schmid,et al. Circulant Temporal Encoding for Video Retrieval and Temporal Alignment , 2015, International Journal of Computer Vision.
[27] Limin Wang,et al. Appearance-and-Relation Networks for Video Classification , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Yang Wang,et al. Visual Relationship Detection Using Joint Visual-Semantic Embedding , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).
[29] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[30] Yejin Choi,et al. Neural Motifs: Scene Graph Parsing with Global Context , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31] James J. Little,et al. Spatio-temporal Relational Reasoning for Video Question Answering , 2019, BMVC.
[32] Rui Caseiro,et al. High-Speed Tracking with Kernelized Correlation Filters , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Тараса Шевченка,et al. Quo vadis? , 2013, Clinical chemistry.
[34] Wei Liu,et al. Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[35] Shih-Fu Chang,et al. Visual Translation Embedding Network for Visual Relation Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Tao Mei,et al. Relation Distillation Networks for Video Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[37] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[38] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[39] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[40] Tsuyoshi Murata,et al. {m , 1934, ACML.
[41] Bohyung Han,et al. Streamlined Dense Video Captioning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[43] Sheng Liu,et al. SibNet: Sibling Convolutional Encoder for Video Captioning , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[44] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Jiebo Luo,et al. Joint Commonsense and Relation Reasoning for Image and Video Captioning , 2020, AAAI.
[46] Ji Zhang,et al. Relationship Proposal Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Tat-Seng Chua,et al. Relation Understanding in Videos: A Grand Challenge Overview , 2019, ACM Multimedia.
[48] Xiaogang Wang,et al. ViP-CNN: Visual Phrase Guided Convolutional Neural Network , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Cordelia Schmid,et al. Learning to Track for Spatio-Temporal Action Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[50] Yongdong Zhang,et al. Learning Multimodal Attention LSTM Networks for Video Captioning , 2017, ACM Multimedia.
[51] Shiliang Pu,et al. Video Relation Detection with Spatio-Temporal Graph , 2019, ACM Multimedia.
[52] Yang Wang,et al. Video Summarization by Learning From Unpaired Data , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).