MediaMill at TRECVID 2014: Searching Concepts, Objects, Instances and Events in Video
Abstract: In this paper we summarize our TRECVID 2014 [12] video retrieval experiments. The MediaMill team participated in five tasks: concept detection, object localization, instance search, event recognition, and event recounting. We experimented with concept detection using deep learning and color difference coding [17], object localization using FLAIR [23], instance search by one example [19], event recognition based on VideoStory [4], and event recounting using COSTA [10]. Our experiments focus on establishing the video retrieval value of these innovations. The 2014 edition of the TRECVID benchmark was again a fruitful one for the MediaMill team, which obtained the best result for both concept detection and object localization.
[1] Masoud Mazloom,et al. Querying for video events by semantic signatures from few examples , 2013, MM '13.
[2] Paul Over,et al. Creating HAVIC: Heterogeneous Audio Visual Internet Collection , 2012, LREC.
[3] Cees Snoek,et al. COSTA: Co-Occurrence Statistics for Zero-Shot Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[4] Arnold W. M. Smeulders,et al. Locality in Generic Instance Search from One Example , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[5] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[6] Xirong Li,et al. Evaluating sources and strategies for learning video concepts from social media , 2013, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI).
[7] Jiri Matas,et al. Efficient representation of local geometry for large scale object retrieval , 2009, CVPR.
[8] Koen E. A. van de Sande,et al. Recommendations for video event recognition using concept vocabularies , 2013, ICMR.
[9] Paul Over,et al. TRECVID 2008 - Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2010, TRECVID.
[10] Cees Snoek,et al. Recommendations for recognizing video events by concept vocabularies , 2014, Comput. Vis. Image Underst.
[11] Masoud Mazloom,et al. Searching informative concept banks for video event detection , 2013, ICMR.
[12] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Arnold W. M. Smeulders,et al. Visual-Concept Search Solved? , 2010, Computer.
[14] Kan Chen,et al. The 2013 SESAME Multimedia Event Detection and Recounting System , 2013, TRECVID.
[15] Marcel Worring,et al. Bootstrapping Visual Categorization With Relevant Negatives , 2013, IEEE Transactions on Multimedia.
[17] Georges Quénot,et al. TRECVID 2011 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[18] Cordelia Schmid,et al. Dense Trajectories and Motion Boundary Descriptors for Action Recognition , 2013, International Journal of Computer Vision.
[19] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[20] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Koen E. A. van de Sande,et al. Fisher and VLAD with FLAIR , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Andrew Zisserman,et al. Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[23] Paul Over,et al. Evaluation campaigns and TRECVid , 2006, MIR '06.
[24] Dennis Koelma,et al. The MediaMill TRECVID 2008 Semantic Video Search Engine , 2008, TRECVID.
[25] Cees Snoek,et al. VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events , 2014, ACM Multimedia.
[26] Florent Perronnin,et al. Modeling the spatial layout of images beyond spatial pyramids , 2012, Pattern Recognit. Lett.
[27] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[28] Marcel Worring,et al. Learning Social Tag Relevance by Neighbor Voting , 2009, IEEE Transactions on Multimedia.
[29] Yannis Avrithis,et al. To Aggregate or Not to aggregate: Selective Match Kernels for Image Search , 2013, 2013 IEEE International Conference on Computer Vision.
[30] Ramakant Nevatia,et al. Evaluating multimedia features and fusion for example-based event detection , 2013, Machine Vision and Applications.
[31] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Masoud Mazloom,et al. Conceptlets: Selective Semantics for Classifying Video Events , 2014, IEEE Transactions on Multimedia.
[33] Stéphane Ayache,et al. Video Corpus Annotation Using Active Learning , 2008, ECIR.
[34] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[35] Koen E. A. van de Sande,et al. Selective Search for Object Recognition , 2013, International Journal of Computer Vision.