Improving video concept detection through label space partitioning

We present an approach to video concept detection on multi-label datasets that builds binary trees partitioning the label space according to visual and semantic similarity. The technique mitigates the sparse-annotation problem by increasing the number of positive examples per concept; the number of classifiers per concept is increased as well, although each individual classifier is sub-optimal. We draw parallels between the proposed tree generation approach and Error Correcting Output Codes (ECOC) for multi-label classification, and we build ranked lists of video shots using either weighted decoding or weighted tree traversal. Using the presented criterion, we build a set of different trees, each partitioning the label space in its own way. Inspired by the work in [1], we aggregate information from the ensemble of trees to build the final ranked list, but using a different criterion. The resulting classification, which provides ensemble error correction, is complementary to One-vs-All classification and significantly increases concept detection performance on the TRECVID 2010 and 2013 datasets.
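The link between a label tree and an ECOC code matrix can be illustrated with a minimal sketch. It is not the authors' method: the seed-based split rule, the function names (`build_tree`, `code_matrix`, `weighted_decode`), the toy similarity matrix, and the uniform node weights are all assumptions made for illustration. Each tree node splits its labels into two groups, each split becomes one ternary column of the code matrix (+1 left, -1 right, 0 label absent from the node), and a shot is scored per concept by a weighted sum of node classifier outputs.

```python
import numpy as np

def build_tree(labels, sim, nodes=None):
    """Recursively split `labels` in two; return the list of (left, right) splits."""
    if nodes is None:
        nodes = []
    if len(labels) < 2:
        return nodes
    # Assumed split rule: seed the two groups with the least similar label pair.
    pairs = [(sim[a, b], a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    _, s1, s2 = min(pairs)
    left, right = [s1], [s2]
    for l in labels:
        if l in (s1, s2):
            continue
        # Attach every other label to the seed it is more similar to.
        (left if sim[l, s1] >= sim[l, s2] else right).append(l)
    nodes.append((left, right))
    build_tree(left, sim, nodes)
    build_tree(right, sim, nodes)
    return nodes

def code_matrix(nodes, n_labels):
    """Ternary code matrix: one row per label, one column per tree node."""
    M = np.zeros((n_labels, len(nodes)))
    for j, (left, right) in enumerate(nodes):
        M[left, j] = 1.0
        M[right, j] = -1.0
    return M

def weighted_decode(M, node_scores, weights):
    """Score each label as a weighted sum of node outputs matching its codeword.

    node_scores: (n_shots, n_nodes) classifier outputs in [-1, 1].
    weights:     (n_nodes,) per-node weights (hypothetical, e.g. validation accuracy).
    """
    return (node_scores * weights) @ M.T

# Toy similarity between 4 concepts: {0, 1} are alike, {2, 3} are alike.
sim = np.array([[1.0, 0.9, 0.1, 0.2],
                [0.9, 1.0, 0.2, 0.1],
                [0.1, 0.2, 1.0, 0.8],
                [0.2, 0.1, 0.8, 1.0]])
nodes = build_tree([0, 1, 2, 3], sim)
M = code_matrix(nodes, n_labels=4)

# One shot whose node classifiers vote "left" at the root and at the {0, 1} node.
node_scores = np.array([[1.0, 1.0, 0.0]])
scores = weighted_decode(M, node_scores, weights=np.ones(len(nodes)))
ranking = np.argsort(-scores[0])  # concepts ordered by decreasing score
```

Ranking shots for a given concept then amounts to sorting one column of `scores` across shots; an ensemble would repeat this for several trees and combine the per-tree scores.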

[1] Sergio Escalera, et al. An incremental node embedding technique for error correcting output codes, 2008, Pattern Recognit.

[2] Shih-Fu Chang, et al. Spherical hashing, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3] Andrea Vedaldi, et al. VLFeat: an open and portable library of computer vision algorithms, 2010, ACM Multimedia.

[4] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[5] Ali Farhadi, et al. Attribute Discovery via Predictable Discriminative Binary Codes, 2012, ECCV.

[6] Bernard Mérialdo, et al. Leveraging from group classification for video concept detection, 2013, 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI).

[7] A. Smeaton, et al. TRECVID 2013 -- An Overview of the Goals, Tasks, Data, Evaluation Mechanisms, and Metrics, 2011.

[8] Przemyslaw Kazienko, et al. Multi-label classification using error correcting output codes, 2012, Int. J. Appl. Math. Comput. Sci.

[9] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2007, ICML '07.

[10] Paul Over, et al. Evaluation campaigns and TRECVid, 2006, MIR '06.

[11] Hsuan-Tien Lin, et al. Multi-label Classification with Error-correcting Codes, 2011, ACML.

[12] David A. Forsyth, et al. Large multi-class image categorization with ensembles of label trees, 2013, 2013 IEEE International Conference on Multimedia and Expo (ICME).

[13] Hsuan-Tien Lin, et al. Multi-label Classification with Error-correcting Codes, 2011.

[14] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.

[15] Antonio Torralba, et al. LabelMe: A Database and Web-Based Tool for Image Annotation, 2008, International Journal of Computer Vision.

[16] Camelia Chira, et al. Error-Correcting Output Codes for Multi-Label Text Categorization, 2012, IIR.

[17] James Allan, et al. A comparison of statistical significance tests for information retrieval evaluation, 2007, CIKM '07.

[18] Sergio Escalera, et al. Separability of ternary codes for sparse designs of error-correcting output codes, 2009, Pattern Recognit. Lett.