Character Grounding and Re-identification in Story of Videos and Text Descriptions
Gunhee Kim | Youngjae Yu | Jongseok Kim | Jiwan Chung | Heeseung Yun
[1] Dietrich Paulus,et al. Simple online and realtime tracking with a deep association metric , 2017, 2017 IEEE International Conference on Image Processing (ICIP).
[2] Sanja Fidler,et al. Visual Semantic Search: Retrieving Videos via Complex Textual Queries , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[3] Gunhee Kim,et al. A Joint Sequence Fusion Model for Video Question Answering and Retrieval , 2018, ECCV.
[4] José M. F. Moura,et al. Visual Coreference Resolution in Visual Dialog using Neural Module Networks , 2018, ECCV.
[5] Andrew Zisserman,et al. From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script , 2018, BMVC.
[6] Qi Tian,et al. Scalable Person Re-identification: A Benchmark , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[7] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[8] Wei Chen,et al. Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework , 2015, AAAI.
[9] Xiaogang Wang,et al. Person Search with Natural Language Description , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Dahua Lin,et al. Person Search in Videos with One Portrait Through Visual and Temporal Links , 2018, ECCV.
[11] Qi Tian,et al. MARS: A Video Benchmark for Large-Scale Person Re-Identification , 2016, ECCV.
[12] Xingyi Zhou,et al. Objects as Points , 2019, ArXiv.
[13] Sanja Fidler,et al. Order-Embeddings of Images and Language , 2015, ICLR.
[14] Andrew Zisserman,et al. “Who are you?” - Learning person specific classifiers from video , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[15] Seong Joon Oh,et al. Generating Descriptions with Grounded and Co-referenced People , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Xiaogang Wang,et al. Identity-Aware Textual-Visual Matching with Latent Co-attention , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[17] Timothy Dozat,et al. Universal Dependency Parsing from Scratch , 2019, CoNLL.
[18] O. Parkhi. It’s in the bag: Stronger supervision for automated face labelling , 2015.
[19] Michael Jones,et al. An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Leonid Sigal,et al. Learning Language-Visual Embedding for Movie Understanding with Natural-Language , 2016, ArXiv.
[21] Cordelia Schmid,et al. Finding Actors and Actions in Movies , 2013, 2013 IEEE International Conference on Computer Vision.
[22] Wei Jiang,et al. Bag of Tricks and a Strong Baseline for Deep Person Re-Identification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[23] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[24] Fei-Fei Li,et al. Linking People in Videos with "Their" Names Using Coreference Resolution , 2014, ECCV.
[25] Yu Qiao,et al. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.
[26] Ning Zhang,et al. Beyond frontal faces: Improving Person Recognition using multiple cues , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Nanning Zheng,et al. Person Re-identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Andrew Zisserman,et al. “Hello! My name is... Buffy” - Automatic Naming of Characters in TV Video , 2006, BMVC.
[29] Bingbing Ni,et al. Learning Context Graph for Person Search , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Hai Tao,et al. Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.
[31] Xiaogang Wang,et al. Diversity Regularized Spatiotemporal Attention for Video-Based Person Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Seong Joon Oh,et al. Person Recognition in Personal Photo Collections , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[34] Christopher Joseph Pal,et al. Movie Description , 2016, International Journal of Computer Vision.
[35] Rita Cucchiara,et al. M-VAD names: a dataset for video captioning with naming , 2018, Multimedia Tools and Applications.
[36] Dahua Lin,et al. Unifying Identification and Context Learning for Person Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Jianxin Wu,et al. Person Re-Identification with Correspondence Structure Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[38] Alessandro Perina,et al. Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[39] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, ArXiv.
[40] Christopher Joseph Pal,et al. Using Descriptive Video Services to Create a Large Data Source for Video Annotation Research , 2015, ArXiv.
[41] Alan L. Yuille,et al. Generation and Comprehension of Unambiguous Object Descriptions , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[43] Bohyung Han,et al. Visual Reference Resolution using Attention Memory for Visual Dialog , 2017, NIPS.
[44] Trevor Darrell,et al. Natural Language Object Retrieval , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Rainer Stiefelhagen,et al. “Knock! Knock! Who is it?” probabilistic person identification in TV-series , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[46] Xiaogang Wang,et al. DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[47] Shiliang Zhang,et al. Pose-Driven Deep Convolutional Model for Person Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[48] Trevor Darrell,et al. Grounding of Textual Phrases in Images by Reconstruction , 2015, ECCV.
[49] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Richard I. Hartley,et al. Person Reidentification Using Spatiotemporal Appearance , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[51] Naokazu Yokoya,et al. Learning Joint Representations of Videos and Sentences with Web Image Search , 2016, ECCV Workshops.
[52] Qi Tian,et al. GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval , 2017, ACM Multimedia.
[53] Stefanos Zafeiriou,et al. ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).