Today's Recommendations

2014 - TRECVID

IRIM at TRECVID 2014: Semantic Indexing and Instance Search

Sabin Tiberius Strat Nicolas Ballas H. Jégou P. Gosselin B. Mérialdo M. Crucianu G. Quénot G. Chollet Joseph Razik H. Bredin S. Ayache Charles-Edmond Bichot H. Glotin Liming Chen D. Petrovska-Delacrétaz A. Stoian P. Lambert E. Dellandréa J. Delhumeau Chao Zhu M. Cord M. Redi Boris Mansencal H. Borgne J. Benois-Pineau A. Benoît Benjamin Labbé Yuxing Tang F. Thollard Ngoc-Trung Tran A. Shabou Bahjat Safadi N. Derbas Rémi Vieux Boyang Gao Abdelkader Hamadi S. Paris Miriam Redi Sébastion Paris

The IRIM group is a consortium of French teams supported by the GDR ISIS and working on multimedia indexing and retrieval. This paper describes its participation in the TRECVID 2014 semantic indexing (SIN) and instance search (INS) tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline to compute scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2796, which ranked us 5th out of 15 participants.
For the INS 2014 task, IRIM followed the classical bag-of-words (BoW) approach, trained only on the EastEnders dataset. Shot signatures were computed either on one key frame or on several key frames (sampled at 1 fps) combined by average pooling. We tested a dissimilarity that computes a distance only over the words present in the query. We also tried a saliency map, built from the object ROI, to incorporate background context. Late fusion of two individual BoW results, obtained with different detectors/descriptors (Hessian-Affine/SIFT and Harris-Laplace/Opponent SIFT), was used.
The four submitted runs were the following:
- Run F_D_IRIM_1 was the late fusion of BoW with SIFT, dissimilarity L2p, on several key frames per shot, with context for queries, and BoW with Opponent SIFT, dissimilarity L1p, on one key frame per shot.
- Run F_D_IRIM_2 was similar to F_D_IRIM_1, but context for queries was also used for the second BoW.
- Run F_D_IRIM_3 was similar to F_D_IRIM_1, but no context for queries was used.
- Run F_D_IRIM_4 was similar to F_D_IRIM_2, but used the delta1 dissimilarity [46] (from the best INS 2013 run).
We found that extracting several key frames per shot, coupled with average pooling, improved results. We confirmed that including context in queries was also beneficial. Surprisingly, our dissimilarity performed better than delta1.
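The instance search steps in this abstract (average pooling of key-frame signatures, a dissimilarity restricted to the query's words, and late fusion of two BoW systems) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not define the L1p and L2p dissimilarities, so a plain L_p distance over the query's words stands in for them, the equal fusion weight is arbitrary, and the function names are hypothetical.

```python
from typing import Dict, List

# A BoW signature: visual word id -> weight (sparse histogram).
Histogram = Dict[int, float]


def average_pool(frame_histograms: List[Histogram]) -> Histogram:
    """Average the BoW histograms of several key frames into one shot signature."""
    if not frame_histograms:
        raise ValueError("need at least one key frame")
    pooled: Histogram = {}
    for hist in frame_histograms:
        for word, weight in hist.items():
            pooled[word] = pooled.get(word, 0.0) + weight
    n = len(frame_histograms)
    return {word: total / n for word, total in pooled.items()}


def query_word_dissimilarity(query: Histogram, shot: Histogram, p: int = 1) -> float:
    """L_p distance summed only over words present in the query.

    Assumption: a stand-in for the paper's L1p/L2p, whose exact form
    is not given in the abstract.
    """
    total = sum(abs(q - shot.get(word, 0.0)) ** p for word, q in query.items())
    return total ** (1.0 / p)


def late_fusion(scores_a: Dict[str, float], scores_b: Dict[str, float],
                weight_a: float = 0.5) -> List[str]:
    """Rank shots by a weighted sum of two systems' dissimilarities (lower is better)."""
    shots = scores_a.keys() & scores_b.keys()
    fused = {s: weight_a * scores_a[s] + (1.0 - weight_a) * scores_b[s] for s in shots}
    return sorted(fused, key=lambda s: fused[s])
```

In this sketch, words that appear only in the shot never contribute to the distance, which is one plausible reading of "computing a distance only for words present in query".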

2015 - TRECVID

ITI-CERTH participation to TRECVID 2015

Yiannis Kompatsiaris Anastasios Dimou Vasileios Mezaris Anastasia Moumtzidou Stefanos Vrochidis Nikolaos Gkalelis

This paper provides an overview of the tasks submitted to TRECVID 2011 by ITI-CERTH. ITI-CERTH participated in the Known-item search (KIS) task as well as in the Semantic Indexing (SIN) and the Event Detection in Internet Multimedia (MED) tasks. In the SIN task, techniques are developed that combine motion information with existing well-performing descriptors such as SURF, Random Forests and Bag-of-Words for shot representation. In the MED task, the trained concept detectors of the SIN task are used to represent video sources with model vector sequences, then a dimensionality reduction method is used to derive a discriminant subspace for recognizing events, and, finally, SVM-based event classifiers are used to detect the underlying video events. The KIS task is performed by employing VERGE, an interactive retrieval application that combines retrieval functionalities in various modalities and exploits implicit user feedback.
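As a rough illustration of the "model vector sequence" representation mentioned for the MED task, the sketch below scores each shot with every concept detector and stacks the resulting vectors in shot order. It assumes a model vector is simply the list of detector scores for one shot. The detectors and feature values are hypothetical toy stand-ins, and the dimensionality reduction and SVM stages are left out.

```python
from typing import Callable, List, Sequence

# Hypothetical detector type: maps a shot's feature vector to a concept score.
Detector = Callable[[Sequence[float]], float]


def model_vector(shot_features: Sequence[float], detectors: List[Detector]) -> List[float]:
    """One score per concept detector for a single shot."""
    return [detect(shot_features) for detect in detectors]


def model_vector_sequence(video_shots: List[Sequence[float]],
                          detectors: List[Detector]) -> List[List[float]]:
    """Represent a video as the ordered sequence of its shots' model vectors."""
    return [model_vector(shot, detectors) for shot in video_shots]


# Toy detectors for illustration only (not trained models).
toy_detectors: List[Detector] = [
    lambda f: max(f),           # stand-in for concept "A"
    lambda f: sum(f) / len(f),  # stand-in for concept "B"
]

sequence = model_vector_sequence([[0.0, 1.0], [0.5, 0.5]], toy_detectors)
```

Each row of `sequence` would then be the input to whatever subspace projection and event classifier follow.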
