Cross-media retrieval using query dependent search methods

Content-based cross-media retrieval is a new type of multimedia retrieval in which the media types of the query examples and the returned results can differ. To learn the semantic correlations among multimedia objects of different modalities, we analyze heterogeneous multimedia objects in the form of multimedia documents (MMDs); an MMD is a set of multimedia objects that are of different media types but carry the same semantics. We first construct an MMD semi-semantic graph (MMDSSG) by jointly analyzing the heterogeneous multimedia data, and then construct a cross-media indexing space (CMIS). For each query, the optimal dimension of the CMIS is determined automatically, and cross-media retrieval is performed on a per-query basis. In this way, the most appropriate retrieval approach is selected for each query, i.e., different search methods are used for different queries. These query-dependent search methods make cross-media retrieval performance both accurate and stable. We also propose different relevance feedback (RF) learning methods to further improve performance. Experimental results are encouraging and validate the proposed methods.
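The idea of choosing the dimension of an indexing space separately for each query can be illustrated with a minimal sketch. The abstract does not specify how the graph is embedded or how the optimal dimension is scored, so everything below is an assumption made for illustration: a Laplacian-eigenvector embedding stands in for the CMIS, and the hypothetical `select_dimension` helper scores each candidate dimension by the mean reciprocal rank of a few items already known to be relevant to the query (for example, from relevance feedback). It should not be read as the paper's actual method.

```python
import numpy as np

def spectral_embedding(W, max_dim):
    """Embed graph nodes with the leading non-trivial eigenvectors of the
    symmetric normalized Laplacian. Assumes W is a symmetric affinity
    matrix of a connected graph (every node has positive degree)."""
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)      # eigenvalues come back in ascending order
    return vecs[:, 1:max_dim + 1]      # drop the trivial first eigenvector

def rank_neighbors(Y, query, dim):
    """Rank all other nodes by Euclidean distance to the query,
    using only the first `dim` embedding coordinates."""
    dist = np.linalg.norm(Y[:, :dim] - Y[query, :dim], axis=1)
    dist[query] = np.inf               # never return the query itself first
    return np.argsort(dist, kind="stable")

def select_dimension(Y, query, positives, dims):
    """Hypothetical per-query criterion (an assumption, not from the paper):
    keep the dimension that gives the known-relevant items the highest
    mean reciprocal rank. Ties go to the first candidate in `dims`."""
    best_dim, best_score = None, -1.0
    for dim in dims:
        order = rank_neighbors(Y, query, dim)
        ranks = np.array([np.flatnonzero(order == p)[0] + 1 for p in positives])
        score = float(np.mean(1.0 / ranks))
        if score > best_score:
            best_dim, best_score = dim, score
    return best_dim

# Toy graph: two tightly connected groups of four nodes joined by one weak edge.
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
np.fill_diagonal(W, 0.0)
W[3, 4] = W[4, 3] = 0.1

Y = spectral_embedding(W, max_dim=3)
dim = select_dimension(Y, query=0, positives=[1, 2], dims=[1, 2, 3])
results = rank_neighbors(Y, 0, dim)[:3]
```

Because the dimension is chosen inside the query loop rather than fixed once for the whole collection, two queries against the same embedding can end up being answered in subspaces of different size, which is the sense in which the search is query dependent.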
