A Graph-Based Framework to Bridge Movies and Synopses
暂无分享,去创建一个
Bolei Zhou | Dahua Lin | Qingqiu Huang | Hang Zhou | Yu Xiong | Lingfeng Guo | Bolei Zhou | Dahua Lin | Yu Xiong | Qingqiu Huang | Hang Zhou | Lingfeng Guo
[1] Dahua Lin,et al. Person Search in Videos with One Portrait Through Visual and Temporal Links , 2018, ECCV.
[2] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] David J. Fleet,et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives , 2017, BMVC.
[4] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.
[5] Harold W. Kuhn,et al. The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.
[6] Yiannis Kompatsiaris,et al. Near-Duplicate Video Retrieval with Deep Metric Learning , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[7] Xirong Li,et al. Dual Encoding for Zero-Example Video Retrieval , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Rainer Stiefelhagen,et al. Book2Movie: Aligning video scenes with book chapters , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Dahua Lin,et al. From Trailers to Storylines: An Efficient Way to Learn from Movies , 2018, ArXiv.
[10] Yi Yang,et al. DevNet: A Deep Event Network for multimedia event detection and evidence recounting , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Ning Zhang,et al. Beyond frontal faces: Improving Person Recognition using multiple cues , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Yin Li,et al. Learning Deep Structure-Preserving Image-Text Embeddings , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Xun Wang,et al. Dual Dense Encoding for Zero-Example Video Retrieval , 2018, ArXiv.
[14] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Gunhee Kim,et al. A Joint Sequence Fusion Model for Video Question Answering and Retrieval , 2018, ECCV.
[16] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[17] Tao Mei,et al. MSR-VTT: A Large Video Description Dataset for Bridging Video and Language , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Gunhee Kim,et al. A Read-Write Memory Network for Movie Story Understanding , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[19] Jongwook Choi,et al. End-to-End Concept Word Detection for Video Captioning, Retrieval, and Question Answering , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Christopher D. Manning,et al. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.
[21] Cristian Sminchisescu,et al. Deep Learning of Graph Matching , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[22] Juan Carlos Niebles,et al. Dense-Captioning Events in Videos , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Jitendra Malik,et al. Shape matching and object recognition using low distortion correspondences , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[24] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Sanja Fidler,et al. MovieQA: Understanding Stories in Movies through Question-Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Fernando De la Torre,et al. Factorized Graph Matching , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Sanja Fidler,et al. Visual Semantic Search: Retrieving Videos via Complex Textual Queries , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[28] Trevor Darrell,et al. Localizing Moments in Video with Natural Language , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[29] Kate Saenko,et al. Multilevel Language and Vision Integration for Text-to-Clip Retrieval , 2018, AAAI.
[30] Andrew Zisserman,et al. From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script , 2018, BMVC.
[31] Sanja Fidler,et al. MovieGraphs: Towards Understanding Human-Centric Situations from Videos , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Chuan-Sheng Foo,et al. Holistic Multi-Modal Memory Network for Movie Question Answering , 2018, IEEE Transactions on Image Processing.
[33] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Bernt Schiele,et al. A dataset for Movie Description , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Yiannis Kompatsiaris,et al. Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features , 2011, IEEE Transactions on Circuits and Systems for Video Technology.
[36] Christopher Potts,et al. The Life and Death of Discourse Entities: Identifying Singleton Mentions , 2013, NAACL.
[37] Yu Qiao,et al. Find and Focus: Retrieve and Localize Video Events with Natural Language Queries , 2018, ECCV.
[38] Kai Chen,et al. MMDetection: Open MMLab Detection Toolbox and Benchmark , 2019, ArXiv.
[39] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[40] Dahua Lin,et al. Unifying Identification and Context Learning for Person Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Michael S. Bernstein,et al. Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).