A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback

We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in multimedia feature space and the history RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated its advantages in precision, robustness, scalability, and computational efficiency.

[1]  Jitendra Malik,et al.  Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[2]  Léon Bottou,et al.  Local Learning Algorithms , 1992, Neural Computation.

[3]  Nicu Sebe,et al.  Content-based multimedia information retrieval: State of the art and challenges , 2006, TOMCCAP.

[4]  Paul A. Viola,et al.  Boosting Image Retrieval , 2004, International Journal of Computer Vision.

[5]  Thomas S. Huang,et al.  Optimizing learning in image retrieval , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[6]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[7]  Yi Yang,et al.  Retrieval based interactive cartoon synthesis via unsupervised bi-distance metric learning , 2009, ACM Multimedia.

[8]  Yi Yang,et al.  Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-Media Retrieval , 2008, IEEE Transactions on Multimedia.

[9]  Feiping Nie,et al.  Trace Ratio Problem Revisited , 2009, IEEE Transactions on Neural Networks.

[10]  Zhongfei Zhang,et al.  Effective Image Retrieval Based on Hidden Concept Discovery in Image Database , 2007, IEEE Transactions on Image Processing.

[11]  Leonidas J. Guibas,et al.  A metric for distributions with applications to image databases , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[12]  Alberto Del Bimbo,et al.  Content-based retrieval of 3D models , 2006, TOMCCAP.

[13]  Xiaojin Zhu,et al.  Semi-Supervised Learning Literature Survey , 2005 .

[14]  Yi Yang,et al.  Harmonizing Hierarchical Manifolds for Multimedia Document Semantics Understanding and Cross-Media Retrieval , 2008, IEEE Transactions on Multimedia.

[15]  Mohan S. Kankanhalli,et al.  Content-based music structure analysis with applications to music semantics understanding , 2004, MULTIMEDIA '04.

[16]  Pavel Pudil,et al.  Introduction to Statistical Pattern Recognition , 2006 .

[17]  Fei Wang,et al.  Label Propagation through Linear Neighborhoods , 2006, IEEE Transactions on Knowledge and Data Engineering.

[18]  Zhuowen Tu,et al.  Learning Context-Sensitive Shape Similarity by Graph Transduction , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Carl D. Meyer,et al.  Deeper Inside PageRank , 2004, Internet Math..

[20]  Chun Chen,et al.  Convex experimental design using manifold structure for image retrieval , 2009, MM '09.

[21]  Hongyuan Zha,et al.  Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment , 2002, ArXiv.

[22]  David J. Kriegman,et al.  Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection , 1996, ECCV.

[23]  Kohji Fukunaga,et al.  Introduction to Statistical Pattern Recognition-Second Edition , 1990 .

[24]  Jing Huang,et al.  Image indexing using color correlograms , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Samy Bengio,et al.  Large Scale Online Learning of Image Similarity Through Ranking , 2009, J. Mach. Learn. Res..

[26]  Qi Tian,et al.  Learning image manifolds by semantic subspace projection , 2006, MM '06.

[27]  Tido Röder,et al.  Efficient content-based retrieval of motion capture data , 2005, SIGGRAPH 2005.

[28]  Shang-Hua Teng,et al.  Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems , 2003, STOC '04.

[29]  Yi Yang,et al.  Image Clustering Using Local Discriminant Models and Global Integration , 2010, IEEE Transactions on Image Processing.

[30]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[31]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[32]  Jianping Fan,et al.  ClassView: hierarchical video shot classification, indexing, and accessing , 2004, IEEE Transactions on Multimedia.

[33]  Meinard Müller,et al.  Efficient content-based retrieval of motion capture data , 2005, SIGGRAPH '05.

[34]  Feiping Nie,et al.  Trace Ratio Criterion for Feature Selection , 2008, AAAI.

[35]  Bernhard Schölkopf,et al.  Transductive Classification via Local Learning Regularization , 2007, AISTATS.

[36]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[37]  U. Feige,et al.  Spectral Graph Theory , 2015 .

[38]  Jiawei Han,et al.  Training Linear Discriminant Analysis in Linear Time , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[39]  Nuno Vasconcelos,et al.  A probabilistic architecture for content-based image retrieval , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[40]  Dong Xu,et al.  Trace Ratio vs. Ratio Trace for Dimensionality Reduction , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Bernhard Schölkopf,et al.  Ranking on Data Manifolds , 2003, NIPS.

[42]  B. Schölkopf,et al.  A Regularization Framework for Learning from Graph Data , 2004, ICML 2004.

[43]  Jiawei Han,et al.  Semi-supervised Discriminant Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[44]  Yi Yang,et al.  Ranking with local regression and global alignment for cross media retrieval , 2009, ACM Multimedia.

[45]  Wei-Ying Ma,et al.  Learning an image manifold for retrieval , 2004, MULTIMEDIA '04.

[46]  Bo Zhang,et al.  Support vector machine learning for image retrieval , 2001, Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205).

[47]  Jingrui He,et al.  Manifold-ranking based image retrieval , 2004, MULTIMEDIA '04.