Enhanced Semi-Supervised Learning for Automatic Video Annotation

For automatic semantic annotation of large-scale video database, the insufficiency of labeled training samples is a major obstacle. General semi-supervised learning algorithms can help solve the problem but the improvement is limited. In this paper, two semi-supervised learning algorithms, self-training and co-training, are enhanced by exploring the temporal consistency of semantic concepts in video sequences. In the enhanced algorithms, instead of individual shots, time-constraint shot clusters are taken as the basic sample units, in which most mis-classifications can be corrected before they are applied for re-training, thus more accurate statistical models can be obtained. Experiments show that enhanced self-training/co-training significantly improves the performance of video annotation

[1]  Meng Wang,et al.  Semi-automatic video annotation based on active learning with multiple complementary predictors , 2005, MIR '05.

[2]  Boon-Lock Yeo,et al.  Extracting story units from long programs for video browsing and navigation , 1996, Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems.

[3]  Fabio Gagliardi Cozman,et al.  Semi-supervised Learning of Classifiers : Theory , Algorithms and Their Application to Human-Computer Interaction , 2004 .

[4]  Nicu Sebe,et al.  Semisupervised learning of classifiers: theory, algorithms, and their application to human-computer interaction , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Edward Y. Chang,et al.  Confidence-based dynamic ensemble for image annotation and semantics discovery , 2003, MULTIMEDIA '03.

[6]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[7]  Pedro M. Domingos,et al.  Unifying Instance-Based and Rule-Based Induction , 1996 .

[8]  Ronald Rosenfeld,et al.  Semi-supervised learning with graphs , 2005 .

[9]  Pedro M. Domingos,et al.  Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier , 1996, ICML.

[10]  Stephan Raaijmakers,et al.  Multimodal topic segmentation and classification of news video , 2002, Proceedings. IEEE International Conference on Multimedia and Expo.

[11]  Dell Zhang,et al.  Validating Co-Training Models for Web Image Classification , 2005 .

[12]  Shih-Fu Chang,et al.  Clustering methods for video browsing and annotation , 1996, Electronic Imaging.

[13]  Martial Hebert,et al.  Semi-Supervised Self-Training of Object Detection Models , 2005, 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05) - Volume 1.