Google challenge: incremental-learning for web video categorization on robust semantic feature space

With the advent of video sharing websites, the number of videos on the internet is growing rapidly. Web video categorization is an efficient way to organize this huge amount of data. In this paper, we propose an effective web video categorization algorithm for large-scale datasets. It has two components: 1) To cope with the great diversity of web videos, we develop an effective semantic feature space called Concept Collection for Web Video Categorization (CCWV-CD) to represent web videos; it consists of concepts with a small semantic gap and high distinguishing ability. In addition, the online Wikipedia API is employed to diffuse concept correlations in this space. 2) We propose an incremental support vector machine with a fixed number of support vectors (n-ISVM) to address the large-scale incremental learning problem in web video categorization. Extensive experiments on a dataset of the 80024 most representative videos on YouTube demonstrate that the semantic space with Wikipedia propagation is more representative for web videos, and that n-ISVM outperforms other algorithms in efficiency when performing incremental learning.