Video2vec Embeddings Recognize Events When Examples Are Scarce
暂无分享,去创建一个
Cees Snoek | AmirHossein Habibian | Thomas Mensink | Cees G. M. Snoek | Thomas Mensink | A. Habibian
[1] Gang Hua,et al. Semantic Model Vectors for Complex Video Event Recognition , 2012, IEEE Transactions on Multimedia.
[2] Mubarak Shah,et al. Recognizing Complex Events Using Large Margin Joint Low-Level Event Model , 2012, ECCV.
[3] Dong Liu,et al. Event-Driven Semantic Concept Discovery by Exploiting Weakly Tagged Internet Images , 2014, ICMR.
[4] Teruko Mitamura,et al. Zero-Example Event Search using MultiModal Pseudo Relevance Feedback , 2014, ICMR.
[5] Jiebo Luo,et al. Utilizing semantic word similarity measures for video retrieval , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[6] Zhigang Ma,et al. From Concepts to Events: a Progressive Process for Multimedia content Analysis , 2013 .
[7] Kristen Grauman,et al. Relative attributes , 2011, 2011 International Conference on Computer Vision.
[8] Jason Weston,et al. WSABIE: Scaling Up to Large Vocabulary Image Annotation , 2011, IJCAI.
[9] Quan Wang,et al. Regularized latent semantic indexing , 2011, SIGIR.
[10] Cees Snoek,et al. Composite Concept Discovery for Zero-Shot Video Event Detection , 2014, ICMR.
[11] Mubarak Shah,et al. High-level event recognition in unconstrained videos , 2013, International Journal of Multimedia Information Retrieval.
[12] Ali Farhadi,et al. Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[13] Dong Liu,et al. Encoding Concept Prototypes for Video Event Detection and Summarization , 2015, ICMR.
[14] Shuang Wu,et al. Zero-Shot Event Detection Using Multi-modal Fusion of Weakly Supervised Concepts , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[15] Andrew Zisserman,et al. Learning Visual Attributes , 2007, NIPS.
[16] Léon Bottou,et al. Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.
[17] Roger Levy,et al. On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] P. Cochat,et al. Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[19] Yi Yang,et al. Fast and Accurate Content-based Semantic Search in 100M Internet Videos , 2015, ACM Multimedia.
[20] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Alexander C. Berg,et al. Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.
[22] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[23] Kilian Q. Weinberger,et al. Large Margin Taxonomy Embedding for Document Categorization , 2008, NIPS.
[24] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[25] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[26] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[27] Paul Over,et al. Creating HAVIC: Heterogeneous Audio Visual Internet Collection , 2012, LREC.
[28] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[29] Jean Ponce,et al. Sparse Modeling for Image and Vision Processing , 2014, Found. Trends Comput. Graph. Vis..
[30] Nicu Sebe,et al. Complex Event Detection via Multi-source Video Attributes , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[32] Afshin Dehghan,et al. Improving Semantic Concept Detection through the Dictionary of Visually-Distinct Elements , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[33] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[34] A. G. Amitha Perera,et al. Multimedia event detection with multimodal feature fusion and temporal concept localization , 2013, Machine Vision and Applications.
[35] Pradipto Das,et al. Translating related words to videos and back through latent topics , 2013, WSDM.
[36] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[37] Patrick Bouthemy,et al. Better Exploiting Motion for Better Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[38] James Allan,et al. Zero-shot video retrieval using content and concepts , 2013, CIKM.
[39] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[40] Mubarak Shah,et al. Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[41] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[42] Cordelia Schmid,et al. Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[43] Christoph H. Lampert,et al. Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[44] Cees Snoek,et al. Recommendations for recognizing video events by concept vocabularies , 2014, Comput. Vis. Image Underst..
[45] Dong Liu,et al. Building A Large Concept Bank for Representing Events in Video , 2014, ArXiv.
[46] Thomas Hofmann,et al. Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..
[47] Shiguang Shan,et al. Informedia@TrecVID 2014: MED and MER , 2014 .
[48] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[49] Cees Snoek,et al. VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events , 2014, ACM Multimedia.
[50] Cordelia Schmid,et al. Label-Embedding for Attribute-Based Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[51] Subhashini Venugopalan,et al. Translating Videos to Natural Language Using Deep Recurrent Neural Networks , 2014, NAACL.
[52] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[53] Georges Quénot,et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[54] Geoffrey E. Hinton,et al. Zero-shot Learning with Semantic Output Codes , 2009, NIPS.
[55] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[56] Hui Cheng,et al. Evaluation of low-level features and their combinations for complex event detection in open source videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[57] Yi Yang,et al. A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Ramakant Nevatia,et al. DISCOVER: Discovering Important Segments for Classification of Video Events and Recounting , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Cees Snoek,et al. Objects2action: Classifying and Localizing Actions without Any Video Example , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[60] Dong Liu,et al. EventNet: A Large Scale Structured Concept Library for Complex Event Detection in Video , 2015, ACM Multimedia.
[61] Cordelia Schmid,et al. Action and Event Recognition with Fisher Vectors on a Compact Feature Set , 2013, 2013 IEEE International Conference on Computer Vision.
[62] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[63] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[64] Thomas Hofmann,et al. Probabilistic Latent Semantic Indexing , 1999, SIGIR Forum.
[65] Larry S. Davis,et al. VRFP: On-the-Fly Video Retrieval Using Web Images and Fast Fisher Vector Products , 2015, IEEE Transactions on Multimedia.
[66] Yi Yang,et al. Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection , 2015, IJCAI.
[67] Chenliang Xu,et al. A Thousand Frames in Just a Few Words: Lingual Description of Videos through Latent Topics and Sparse Object Stitching , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[68] Wei Chen,et al. Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework , 2015, AAAI.
[69] Thomas Mensink,et al. Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.
[70] Nuno Vasconcelos,et al. Dynamic Pooling for Complex Event Recognition , 2013, 2013 IEEE International Conference on Computer Vision.
[71] Michael I. Jordan,et al. Modeling annotated data , 2003, SIGIR.
[72] Shuang Wu,et al. Multimodal feature fusion for robust event detection in web videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[73] Dong Liu,et al. BBNVISER : BBN VISER TRECVID 2012 Multimedia Event Detection and Multimedia Event Recounting Systems , 2012, TRECVID.
[74] Cees Snoek,et al. UvA-DARE ( Digital Academic Repository ) Event Fisher Vectors : Robust Encoding Visual Diversity of Visual Streams , 2015 .
[75] Hagai Attias,et al. Topic regression multi-modal Latent Dirichlet Allocation for image annotation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[76] Fei-Fei Li,et al. Combining the Right Features for Complex Event Recognition , 2013, 2013 IEEE International Conference on Computer Vision.
[77] Mubarak Shah,et al. Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching , 2010, TRECVID.
[78] Cordelia Schmid,et al. A Robust and Efficient Video Representation for Action Recognition , 2015, International Journal of Computer Vision.
[79] Nicu Sebe,et al. Complex Event Detection via Event Oriented Dictionary Learning , 2015, AAAI.
[80] Shiguang Shan,et al. Self-Paced Curriculum Learning , 2015, AAAI.
[81] Yi Yang,et al. Searching Persuasively: Joint Event Detection and Evidence Recounting with Limited Supervision , 2015, ACM Multimedia.
[82] Fei-Fei Li,et al. Learning latent temporal structure for complex event detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[83] Masoud Mazloom,et al. Conceptlets: Selective Semantics for Classifying Video Events , 2014, IEEE Transactions on Multimedia.