Active Cleaning for Video Corpus Annotation

In this paper, we have described the Active Cleaning approach that was used to complement the active learning approach in the TRECVID collaborative annotation. It uses a classification system to select the samples to be re-annotated in order to improve the quality of the annotations. We have evaluated the actual impact of our active cleaning approach on the TRECVID 2007 collection, using full annotations collected from the TRECVID collaborative annotations and the MCG-ICT-CAS annotations. In our experiments, using our cleaning strategy, denoted Cross-Val, to select a small fraction of samples for re-annotation improved the performance of our annotation system significantly more than using the same fraction to annotate new samples. Furthermore, the results show that higher performance can be reached by double-annotating 10% of the negative samples, or 5% of all the annotated samples, selected by the proposed cleaning strategy.
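The selection step can be illustrated with a minimal sketch. It is not the system evaluated in the paper: it assumes binary labels (1 for positive, 0 for negative), numeric feature vectors, and a simple nearest-centroid scorer standing in for the actual classifier. The function names `cross_val_scores` and `select_for_reannotation` are hypothetical, and the `fraction` values only echo the 5% and 10% budgets mentioned above. Each sample is scored by a model trained on the other folds, and the samples whose current label disagrees most with that score are proposed for re-annotation.

```python
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_val_scores(features, labels, n_folds=5, seed=0):
    """Score each sample with a model trained on the other folds.

    Higher score means the held-out sample looks more positive.
    The nearest-centroid scorer is an assumption, not the paper's classifier.
    """
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = [0.0] * len(features)
    for held_out in folds:
        held = set(held_out)
        pos = [features[i] for i in idx if i not in held and labels[i] == 1]
        neg = [features[i] for i in idx if i not in held and labels[i] == 0]
        if not pos or not neg:
            continue  # cannot train on one class only; scores stay at 0.0
        c_pos, c_neg = centroid(pos), centroid(neg)
        for i in held_out:
            scores[i] = sq_dist(features[i], c_neg) - sq_dist(features[i], c_pos)
    return scores


def select_for_reannotation(features, labels, fraction=0.05, negatives_only=False):
    """Return indices of the most suspicious labels, most suspicious first."""
    scores = cross_val_scores(features, labels)
    # A negative with a high score, or a positive with a low score, is suspicious.
    suspicion = [s if y == 0 else -s for s, y in zip(scores, labels)]
    candidates = [i for i, y in enumerate(labels) if not (negatives_only and y == 1)]
    budget = max(1, int(fraction * len(candidates)))
    candidates.sort(key=lambda i: suspicion[i], reverse=True)
    return candidates[:budget]
```

Calling `select_for_reannotation(features, labels, fraction=0.10, negatives_only=True)` would restrict the budget to samples currently labeled negative, while the default call ranks all annotated samples.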
