Human-Centric Visual Relation Segmentation Using Mask R-CNN and VTransE
Fan Yu | Xin Tan | Gangshan Wu | Tongwei Ren
[1] Luis Herranz, et al. Image Captioning with both Object and Scene Information, 2016, ACM Multimedia.
[2] Jason Weston, et al. Translating Embeddings for Modeling Multi-relational Data, 2013, NIPS.
[3] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Josephine Sullivan, et al. One millisecond face alignment with an ensemble of regression trees, 2014, IEEE Conference on Computer Vision and Pattern Recognition.
[5] Kaiming He, et al. Mask R-CNN, 2017, IEEE International Conference on Computer Vision (ICCV).
[6] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.
[7] Tat-Seng Chua, et al. Video Visual Relation Detection, 2017, ACM Multimedia.
[8] Eric P. Xing, et al. Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Shih-Fu Chang, et al. Visual Translation Embedding Network for Visual Relation Detection, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Michael S. Bernstein, et al. Visual Relationship Detection with Language Priors, 2016, ECCV.
[11] Min Xu, et al. Learning Multi-view Deep Features for Small Object Retrieval in Surveillance Scenarios, 2015, ACM Multimedia.
[12] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.