Structure-sensitive manifold ranking for video concept detection

Pairwise similarity between samples is an essential factor in graph-propagation-based semi-supervised learning methods. It is usually estimated from Euclidean distance alone. However, the structural assumption, on which these methods fundamentally rely, is not taken into account by this conventional pairwise similarity measure. In this paper, we propose a novel graph-based learning approach, named Structure-Sensitive Manifold Ranking (SSMR), based on a structure-sensitive similarity measure. Instead of using distance only, SSMR takes local distribution differences into account to measure pairwise similarity more accurately. Furthermore, we show that SSMR can also be derived from anisotropic diffusion based on a partial differential equation. Experiments conducted on the TRECVID dataset show that this approach significantly outperforms existing graph-based semi-supervised learning methods for video semantic concept detection.
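To make the idea of a structure-sensitive similarity concrete, the following is a minimal sketch and not the formulation used in SSMR itself, which this abstract does not spell out. It assumes a Gaussian kernel on Euclidean distance, a crude kernel-density estimate as the "local distribution", and an exponential penalty on density differences; the function names and the parameters `sigma`, `gamma`, and `alpha` are hypothetical. The ranking step uses the standard closed-form manifold ranking solution f = (I - alpha S)^{-1} y on the symmetrically normalized affinity matrix.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Distance-only similarity: Gaussian kernel on Euclidean distance."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the graph
    return W

def density_aware_affinity(X, sigma=1.0, gamma=1.0):
    """Illustrative structure-sensitive similarity (an assumption, not the
    paper's formula): down-weight edges whose endpoints lie in regions of
    different local density."""
    W = gaussian_affinity(X, sigma)
    # Unnormalized kernel density estimate at each sample.
    density = W.sum(axis=1) / (len(X) - 1)
    density_diff = np.abs(density[:, None] - density[None, :])
    return W * np.exp(-gamma * density_diff)

def manifold_ranking(W, y, alpha=0.9):
    """Closed-form manifold ranking: solve (I - alpha * S) f = y,
    where S = D^{-1/2} W D^{-1/2}."""
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(y)) - alpha * S, y)

# Toy data: two well-separated clusters, one labeled positive sample.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
y = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

W = density_aware_affinity(X, sigma=0.5, gamma=1.0)
scores = manifold_ranking(W, y, alpha=0.9)
```

On this toy input, the unlabeled samples in the same cluster as the labeled one should receive higher ranking scores than those in the distant cluster. Replacing `density_aware_affinity` with `gaussian_affinity` gives the distance-only baseline for comparison.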
