A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching
Chenliang Xu | Pradipto Das | Jason J. Corso | Richard F. Doell
[1] Michael I. Jordan. Graphical Models, 2003.
[2] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[3] Michael I. Jordan,et al. Modeling annotated data , 2003, SIGIR.
[4] Eduard H. Hovy,et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics , 2003, NAACL.
[5] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res.
[6] Anja Belz,et al. Comparing Automatic and Human Evaluation of NLG Systems , 2006, EACL.
[7] Fei-Fei Li,et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[8] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[9] Paul Over,et al. TRECVID 2008 - Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2010, TRECVID.
[10] Michael I. Jordan,et al. Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn.
[11] Vladimir Pavlovic,et al. A New Baseline for Image Annotation , 2008, ECCV.
[12] Chong Wang,et al. Simultaneous image classification and annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[13] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.
[14] Cordelia Schmid,et al. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[15] Shaogang Gong,et al. A Markov Clustering Topic Model for mining behaviour in video , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[16] Andrew McCallum,et al. Rethinking LDA: Why Priors Matter , 2009, NIPS.
[17] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[19] Yansong Feng,et al. Topic Models for Image Annotation and Text Illustration , 2010, HLT-NAACL.
[20] Georges Quénot,et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[21] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.
[22] Hagai Attias,et al. Topic regression multi-modal Latent Dirichlet Allocation for image annotation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[23] Cyrus Rashtchian,et al. Every Picture Tells a Story: Generating Sentences from Images , 2010, ECCV.
[24] Christoph H. Lampert,et al. Topic models for semantics-preserving video compression , 2010, MIR '10.
[25] Alexei A. Efros,et al. Unbiased look at dataset bias , 2011, CVPR 2011.
[26] Yejin Choi,et al. Baby talk: Understanding and generating simple image descriptions , 2011, CVPR 2011.
[27] Lei Zhang,et al. Towards coherent natural language description of video streams , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
[28] Yiannis Aloimonos,et al. Corpus-Guided Sentence Generation of Natural Images , 2011, EMNLP.
[29] François Brémond,et al. Evaluation of Local Descriptors for Action Recognition in Videos , 2011, ICVS.
[30] Ali Farhadi,et al. Recognition using visual phrases , 2011, CVPR 2011.
[31] Sven J. Dickinson,et al. Video In Sentences Out , 2012, UAI.
[32] Deva Ramanan,et al. Efficiently Scaling up Crowdsourced Video Annotation , 2012, International Journal of Computer Vision.
[33] Bernt Schiele,et al. Script Data for Attribute-Based Recognition of Composite Activities , 2012, ECCV.
[34] Jason J. Corso,et al. Action bank: A high-level representation of activity in video , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[35] Kate Saenko,et al. Generating Natural-Language Video Descriptions Using Text-Mined Knowledge , 2013, AAAI.
[36] Pradipto Das,et al. Translating related words to videos and back through latent topics , 2013, WSDM.