Paper Information - Videos Semantic Indexing using Image Classification

Videos Semantic Indexing using Image Classification

This notebook paper summarizes Team NEC-UIUC’s approaches for the TRECVid 2010 Evaluation of Semantic Indexing. Our submissions mainly take advantage of advanced image classification methods that use local coordinate coding (LCC) of local features, powered by the distributed computing software Hadoop. For every video shot, we evenly sample key frames and extract dense local features, including DHOG and LBP, which are encoded by LCC. Then, for every concept, large-scale linear SVM classifiers are trained on spatial pyramids of LCC features. Finally, we employ multiple instance learning to rank the video shots according to the SVM scores of individual frames. Our systems achieve a mean extended inferred average precision (mean xinfAP) of 7.40% on the 30 concepts evaluated by NIST, and a mean average precision of 28.63% on all 130 concepts when 1/5 of the development data is used as the validation set.
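The final ranking step, in which per-frame SVM scores are combined into a per-shot ranking, can be illustrated with a minimal sketch. The abstract does not state which aggregation rule the multiple instance learning step uses, so the two rules below (max pooling and a noisy-OR over sigmoid-squashed scores) are assumptions chosen for illustration, and the names `shot_score`, `rank_shots`, and the example shot IDs are hypothetical.

```python
import math


def sigmoid(x):
    """Map a raw SVM score to a pseudo-probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def shot_score(frame_scores, rule="max"):
    """Aggregate the SVM scores of a shot's key frames into one shot score.

    Assumed rules (not confirmed by the abstract):
      "max"      - the shot is as positive as its most positive frame
      "noisy_or" - probability that at least one frame is positive
    """
    if not frame_scores:
        raise ValueError("a shot needs at least one key frame")
    if rule == "max":
        return max(frame_scores)
    if rule == "noisy_or":
        all_negative = 1.0
        for s in frame_scores:
            all_negative *= 1.0 - sigmoid(s)
        return 1.0 - all_negative
    raise ValueError("unknown rule: " + rule)


def rank_shots(shots, rule="max"):
    """Return shot IDs sorted from most to least likely to show the concept.

    `shots` maps a shot ID to the list of its key-frame SVM scores.
    """
    return sorted(shots, key=lambda sid: shot_score(shots[sid], rule), reverse=True)


# Hypothetical frame scores for three shots and one concept.
shots = {
    "shot_a": [-1.2, 0.3, -0.5],
    "shot_b": [0.9, -2.0],
    "shot_c": [-0.8, -0.7, -0.9],
}
ranking_max = rank_shots(shots, rule="max")
ranking_nor = rank_shots(shots, rule="noisy_or")
```

Under max pooling, a single confident frame decides the shot's rank, whereas the noisy-OR rule also rewards shots with several moderately positive frames, so the two rules can order the same shots differently.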

[1] Yihong Gong, et al. Nonlinear Learning using Local Coordinate Coding, 2009, NIPS.

[2] David G. Lowe, et al. Distinctive Image Features from Scale-Invariant Keypoints, 2004.

[3] Paul A. Viola, et al. Multiple Instance Boosting for Object Detection, 2005, NIPS.

[4] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.

[5] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification, 2008, J. Mach. Learn. Res.

[6] Matti Pietikäinen, et al. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[7] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8] Chih-Jen Lin, et al. Large Linear Classification When Data Cannot Fit in Memory, 2011, TKDD.