Relay Boost Fusion for Learning Rare Concepts in Multimedia

This paper relates the learning of rare concepts for multimedia retrieval to the more general setting of learning from imbalanced data. A Relay Boost (RL.Boost) algorithm is proposed to address the imbalanced data problem by fusing multiple features extracted from the multimedia data. As a modified RankBoost algorithm, RL.Boost directly minimizes the ranking loss rather than the classification error. RL.Boost also iteratively samples positive/negative pairs to form a more balanced data set, obtains diverse weak rankers from different features, and combines them in a ranking ensemble. Experiments on the standard TRECVID 2005 benchmark data set show the effectiveness of the proposed algorithm.
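
The idea of sampling positive/negative pairs and combining per-feature weak rankers can be illustrated with a minimal RankBoost-style sketch. This is not the RL.Boost algorithm itself, whose details are not given here. It assumes that each feature already provides a score in [0, 1] per item and that this raw score serves as the weak ranker. The names `ranking_loss`, `boost_rankers`, and `fuse`, the feature names `color` and `texture`, and the toy data are all hypothetical.

```python
import math
import random


def ranking_loss(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive item
    is not scored strictly above the negative item."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    misordered = sum(1 for p in pos for n in neg if p <= n)
    return misordered / (len(pos) * len(neg))


def boost_rankers(features, labels, rounds=5, pairs_per_round=32, seed=0):
    """RankBoost-style loop (illustrative assumption, not the paper's method).

    features: dict mapping a feature name to a list of scores in [0, 1].
    Returns a list of (alpha, feature_name) weak rankers.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    weights = [1.0 / len(pairs)] * len(pairs)
    ensemble = []
    for _ in range(rounds):
        # Sample pairs according to the current pair weights, so every
        # sampled example couples one positive with one negative item.
        sample = rng.choices(pairs, weights=weights, k=pairs_per_round)

        def margin(name):
            h = features[name]
            return sum(h[p] - h[n] for p, n in sample) / len(sample)

        # Pick the feature that best orders the sampled pairs.
        best = max(features, key=margin)
        r = max(-0.999, min(0.999, margin(best)))
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        ensemble.append((alpha, best))

        # Up-weight pairs that the chosen ranker orders poorly.
        h = features[best]
        weights = [w * math.exp(-alpha * (h[p] - h[n]))
                   for w, (p, n) in zip(weights, pairs)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def fuse(ensemble, features, n_items):
    """Weighted sum of the selected feature scores for each item."""
    return [sum(alpha * features[name][i] for alpha, name in ensemble)
            for i in range(n_items)]


# Toy imbalanced data: 2 positives, 8 negatives (made-up values).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
features = {
    "color":   [0.9, 0.4, 0.5, 0.1, 0.2, 0.3, 0.1, 0.2, 0.0, 0.3],
    "texture": [0.3, 0.8, 0.2, 0.6, 0.1, 0.1, 0.2, 0.0, 0.3, 0.1],
}
ensemble = boost_rankers(features, labels)
fused_loss = ranking_loss(fuse(ensemble, features, len(labels)), labels)
```

In this toy set, each feature misorders a different positive/negative pair, so a weighted combination can rank better than either feature alone. Because the loss is computed over pairs, it is not dominated by the large number of negatives the way a plain classification error rate would be.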
