Stop-Frame Removal Improves Web Video Classification

Web videos available on sharing sites like YouTube are becoming an alternative to the manually annotated training data needed to create video classifiers. However, on closer inspection, web videos contain many irrelevant frames, such as blank and overexposed frames, that may appear at random in any video. We call these irrelevant frames stop-frames and propose a simple algorithm to identify and exclude them during classifier training. Because stop-frames can appear in any video, they cannot be tied to a single category. We therefore identify stop-frames as those frames that are commonly misclassified by any concept classifier. Our experiments demonstrate that our algorithm improves classification accuracy, in terms of mean average precision, by 60% on an event detection benchmark and by 24% on a concept detection benchmark.
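
The selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how "commonly misclassified" is measured, so the function name `find_stop_frames`, the score matrix layout, and the two thresholds (`score_threshold`, `min_fraction`) are all assumptions made for illustration. The sketch assumes each frame inherits one concept label from its video and that every concept classifier outputs a confidence in [0, 1] for every frame.

```python
import numpy as np


def find_stop_frames(scores, labels, score_threshold=0.5, min_fraction=0.5):
    """Flag frames that many concept classifiers wrongly accept.

    scores: (n_frames, n_concepts) classifier confidences in [0, 1].
    labels: (n_frames,) index of the concept each frame inherits from its video.
    Both thresholds are hypothetical parameters, not values from the paper.
    Returns a boolean mask of shape (n_frames,), True for suspected stop-frames.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n_frames, n_concepts = scores.shape

    # A classifier "fires" on a frame when its confidence reaches the threshold.
    fired = scores >= score_threshold

    # Mark, for each frame, the classifier of its own (inherited) concept.
    own = np.zeros_like(fired)
    own[np.arange(n_frames), labels] = True

    # A firing by any other concept's classifier counts as a misclassification.
    false_positives = fired & ~own

    # Fraction of the other classifiers that wrongly accept each frame.
    fraction = false_positives.sum(axis=1) / max(n_concepts - 1, 1)
    return fraction >= min_fraction


# Toy example: 4 frames, 3 concepts (made-up scores).
demo_scores = [
    [0.9, 0.1, 0.2],  # accepted only by its own classifier
    [0.8, 0.9, 0.7],  # accepted by every classifier
    [0.2, 0.3, 0.9],  # accepted only by its own classifier
    [0.7, 0.8, 0.9],  # accepted by every classifier
]
demo_labels = [0, 1, 2, 0]
stop_mask = find_stop_frames(demo_scores, demo_labels)
```

In this toy example the second and fourth frames are flagged, since both other classifiers accept them. The flagged frames would then be dropped from the training set before the classifiers are retrained.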
