Related Papers

Abstract: In this paper, we summarize our TRECVID 2014 [12] video retrieval experiments. The MediaMill team participated in five tasks: concept detection, object localization, instance search, event recognition, and recounting. We experimented with concept detection using deep learning and color difference coding [17], object localization using FLAIR [23], instance search by one example [19], event recognition based on VideoStory [4], and event recounting using COSTA [10]. Our experiments focus on establishing the video retrieval value of these innovations. Participation in the 2014 edition of the TRECVID benchmark was again fruitful for the MediaMill team, yielding the best results for concept detection and object localization.

References

[1] Masoud Mazloom, et al. Querying for video events by semantic signatures from few examples, 2013, MM '13.

[2] Paul Over, et al. Creating HAVIC: Heterogeneous Audio Visual Internet Collection, 2012, LREC.

[3] Cees Snoek, et al. COSTA: Co-Occurrence Statistics for Zero-Shot Classification, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4] Arnold W. M. Smeulders, et al. Locality in Generic Instance Search from One Example, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Cordelia Schmid, et al. Aggregating local descriptors into a compact image representation, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6] Xirong Li, et al. Evaluating sources and strategies for learning video concepts from social media, 2013, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI).

[7] Jiri Matas, et al. Efficient representation of local geometry for large scale object retrieval, 2009, CVPR.

[8] Koen E. A. van de Sande, et al. Recommendations for video event recognition using concept vocabularies, 2013, ICMR.

[9] Paul Over, et al. TRECVID 2008 - Goals, Tasks, Data, Evaluation Mechanisms and Metrics, 2010, TRECVID.

[10] Cees Snoek, et al. Recommendations for recognizing video events by concept vocabularies, 2014, Comput. Vis. Image Underst.

[11] Masoud Mazloom, et al. Searching informative concept banks for video event detection, 2013, ICMR.

[12] Koen E. A. van de Sande, et al. Evaluating Color Descriptors for Object and Scene Recognition, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13] Arnold W. M. Smeulders, et al. Visual-Concept Search Solved?, 2010, Computer.

[14] Kan Chen, et al. The 2013 SESAME Multimedia Event Detection and Recounting System, 2013, TRECVID.

[15] Marcel Worring, et al. Bootstrapping Visual Categorization With Relevant Negatives, 2013, IEEE Transactions on Multimedia.

[16] G. G. Stokes "J.", 1890, The New Yale Book of Quotations.

[17] Georges Quénot, et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics, 2011, TRECVID.

[18] Cordelia Schmid, et al. Dense Trajectories and Motion Boundary Descriptors for Action Recognition, 2013, International Journal of Computer Vision.

[19] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.

[20] Subhransu Maji, et al. Classification using intersection kernel support vector machines is efficient, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[21] Koen E. A. van de Sande, et al. Fisher and VLAD with FLAIR, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[22] Andrew Zisserman, et al. Three things everyone should know to improve object retrieval, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23] Paul Over, et al. Evaluation campaigns and TRECVid, 2006, MIR '06.

[24] Dennis Koelma, et al. The MediaMill TRECVID 2008 Semantic Video Search Engine, 2008, TRECVID.

[25] Cees Snoek, et al. VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events, 2014, ACM Multimedia.

[26] Florent Perronnin, et al. Modeling the spatial layout of images beyond spatial pyramids, 2012, Pattern Recognit. Lett.

[27] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[28] Marcel Worring, et al. Learning Social Tag Relevance by Neighbor Voting, 2009, IEEE Transactions on Multimedia.

[29] Yannis Avrithis, et al. To Aggregate or Not to aggregate: Selective Match Kernels for Image Search, 2013, 2013 IEEE International Conference on Computer Vision.

[30] Ramakant Nevatia, et al. Evaluating multimedia features and fusion for example-based event detection, 2013, Machine Vision and Applications.

[31] Cordelia Schmid, et al. Product Quantization for Nearest Neighbor Search, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[32] Masoud Mazloom, et al. Conceptlets: Selective Semantics for Classifying Video Events, 2014, IEEE Transactions on Multimedia.

[33] Stéphane Ayache, et al. Video Corpus Annotation Using Active Learning, 2008, ECIR.

[34] Thomas Mensink, et al. Improving the Fisher Kernel for Large-Scale Image Classification, 2010, ECCV.

[35] Koen E. A. van de Sande, et al. Selective Search for Object Recognition, 2013, International Journal of Computer Vision.

Cited By
Deep Learning Based Imbalanced Data Classification and Information Retrieval for Multimedia Big Data. 2018.
Multimedia Pivot Tables for Multimedia Analytics on Image Collections. IEEE Transactions on Multimedia, 2016.
Insight in Image Collections by Multimedia Pivot Tables. ICMR, 2015.
Few-Shot Adaptation for Multimedia Semantic Indexing. ACM Multimedia, 2018.
Semantic Indexing for Large-Scale Video Retrieval. 2016.
Visual Learning of Socio-Video Semantics. 2015.
Topological Spatial Verification for Instance Search. IEEE Transactions on Multimedia, 2015.
Minimally Needed Evidence for Complex Event Recognition in Unconstrained Videos. ICMR, 2014.
Enhanced image and video representation for visual recognition. 2014.
Video Content Understanding Using Text. 2020.
Objects2action: Classifying and Localizing Actions without Any Video Example. 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
Efficient Imbalanced Multimedia Concept Retrieval by Deep Learning on Spark Clusters. Int. J. Multim. Data Eng. Manag., 2017.
Web-scale Multimedia Search for Internet Video Content. WSDM, 2016.
Super Fast Event Recognition in Internet Videos. IEEE Transactions on Multimedia, 2015.
Error-Driven Incremental Learning in Deep Convolutional Neural Network for Large-Scale Image Classification. ACM Multimedia, 2014.
Fast Coding of Feature Vectors Using Neighbor-to-Neighbor Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
Predicting Behavioural Patterns in Discussion Forums using Deep Learning on Hypergraphs. 2019 International Conference on Content-Based Multimedia Indexing (CBMI), 2019.
Integrating deep learning with correlation-based multimedia semantic concept detection. 2015.
Correlation-Based Deep Learning for Multimedia Semantic Concept Detection. WISE, 2015.