Weakly-Supervised Alignment of Video with Text
Cordelia Schmid | Jean Ponce | Ivan Laptev | Francis R. Bach | Edouard Grave | Piotr Bojanowski | Rémi Lajugie
[1] G. G. Stokes. "J." , 1890, The New Yale Book of Quotations.
[2] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[3] Philip Wolfe,et al. An algorithm for quadratic programming , 1956 .
[4] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[5] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[6] David A. Forsyth,et al. Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.
[7] David A. Forsyth,et al. Matching Words and Pictures , 2003, J. Mach. Learn. Res..
[8] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[9] Jake K. Aggarwal,et al. Recognition of Composite Human Activities through Context-Free Grammar Based Representation , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[10] Dale Schuurmans,et al. Convex Relaxations of Latent Variable Training , 2007, NIPS.
[11] David J. Kriegman,et al. Leveraging temporal, contextual and ordering constraints for recognizing complex activities in video , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[12] Zaïd Harchaoui,et al. DIFFRAC: a discriminative and flexible framework for clustering , 2007, NIPS.
[13] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[14] Cordelia Schmid,et al. Actions in context , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[15] Fernando De la Torre,et al. Canonical Time Warping for Alignment of Human Behavior , 2009, NIPS.
[16] Silvia Bernardini,et al. The WaCky wide web: a collection of very large linguistically processed web-crawled corpora , 2009, Lang. Resour. Evaluation.
[17] Jean Ponce,et al. Discriminative clustering for image co-segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[18] Martial Hebert,et al. Modeling the Temporal Extent of Actions , 2010, ECCV.
[19] Fei-Fei Li,et al. Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[20] Cyrus Rashtchian,et al. Every Picture Tells a Story: Generating Sentences from Images , 2010, ECCV.
[21] Bohyung Han,et al. Scenario-based video event recognition by constraint flow , 2011, CVPR 2011.
[22] Cordelia Schmid,et al. Actom sequence models for efficient action detection , 2011, CVPR 2011.
[23] Vicente Ordonez,et al. Im2Text: Describing Images Using 1 Million Captioned Photographs , 2011, NIPS.
[24] Rainer Stiefelhagen,et al. “Knock! Knock! Who is it?” probabilistic person identification in TV-series , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Larry S. Davis,et al. Combining Per-frame and Per-track Cues for Multi-person Action Recognition , 2012, ECCV.
[26] Jean Ponce,et al. Multi-class cosegmentation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[27] Bernt Schiele,et al. Grounding Action Descriptions in Videos , 2013, TACL.
[28] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[29] Cordelia Schmid,et al. Finding Actors and Actions in Movies , 2013, 2013 IEEE International Conference on Computer Vision.
[30] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[31] Micah Hodosh,et al. Framing image description as a ranking task , 2013 .
[32] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[33] Peter Young,et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics , 2013, J. Artif. Intell. Res..
[34] Eamonn J. Keogh,et al. Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping , 2013, TKDD.
[35] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[36] Mohamed R. Amer,et al. Monte Carlo Tree Search for Scheduling Activity Recognition , 2013, 2013 IEEE International Conference on Computer Vision.
[37] Bernt Schiele,et al. Translating Video Content to Natural Language Descriptions , 2013, 2013 IEEE International Conference on Computer Vision.
[38] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[39] Cordelia Schmid,et al. Weakly Supervised Action Labeling in Videos under Ordering Constraints , 2014, ECCV.
[40] Mihai Surdeanu,et al. The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.
[41] Fei-Fei Li,et al. Efficient Image and Video Co-localization with Frank-Wolfe Algorithm , 2014, ECCV.
[42] Armand Joulin,et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping , 2014, NIPS.
[43] Fei-Fei Li,et al. Linking People in Videos with "Their" Names Using Coreference Resolution , 2014, ECCV.
[44] Francis R. Bach,et al. A Markovian approach to distributional semantics with application to semantic compositionality , 2014, COLING.
[45] Ronan Collobert,et al. Phrase-based Image Captioning , 2015, ICML.
[46] Rainer Stiefelhagen,et al. Book2Movie: Aligning video scenes with book chapters , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Kevin Murphy,et al. What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision , 2015, NAACL.
[48] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .