Cross-media Cross-genre Information Ranking based on Multi-media Information Networks

Current web technology has dispersed information about any given topic across data from different domains and modalities, such as texts and images from news and social media. Automatically extracting the most informative and important multimedia summary (e.g., a ranked list of inter-connected texts and images) from massive amounts of cross-media and cross-genre data can significantly reduce the time and effort users spend browsing. In this paper, we propose a novel method for this new task based on Multi-media Information Networks (MiNets), which are constructed automatically by incorporating cross-genre knowledge and inferring implicit similarity across texts and images. A novel random walk-based algorithm exploits the facts in MiNets to iteratively propagate ranking scores across multiple data modalities. Experimental results demonstrate the effectiveness of our MiNets-based approach and the power of cross-media cross-genre inference.
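To make the idea of propagating ranking scores across modalities concrete, the following is a minimal sketch of a generic PageRank-style co-ranking over a text-image network. It is not the algorithm of the paper, whose details the abstract does not give. The similarity matrices, the function names (`normalize_rows`, `step`, `co_rank`), and the `damping` and `cross_weight` parameters are all assumptions made for illustration.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def step(scores, transition):
    """One random-walk step: new[j] = sum_i scores[i] * transition[i][j]."""
    n_out = len(transition[0])
    return [
        sum(scores[i] * transition[i][j] for i in range(len(scores)))
        for j in range(n_out)
    ]


def co_rank(text_sim, image_sim, cross_sim,
            damping=0.85, cross_weight=0.5, iters=100, tol=1e-9):
    """Iteratively propagate scores within and across two modalities.

    text_sim:  n_t x n_t text-text similarity (hypothetical input)
    image_sim: n_i x n_i image-image similarity (hypothetical input)
    cross_sim: n_t x n_i text-image similarity (hypothetical input)
    """
    n_t, n_i = len(text_sim), len(image_sim)
    p_tt = normalize_rows(text_sim)
    p_ii = normalize_rows(image_sim)
    p_ti = normalize_rows(cross_sim)                               # text -> image
    p_it = normalize_rows([list(c) for c in zip(*cross_sim)])      # image -> text

    t = [1.0 / n_t] * n_t
    im = [1.0 / n_i] * n_i
    for _ in range(iters):
        t_within, t_cross = step(t, p_tt), step(im, p_it)
        i_within, i_cross = step(im, p_ii), step(t, p_ti)
        # Mix a uniform prior with within-modality and cross-modality votes.
        new_t = [(1 - damping) / n_t
                 + damping * ((1 - cross_weight) * a + cross_weight * b)
                 for a, b in zip(t_within, t_cross)]
        new_i = [(1 - damping) / n_i
                 + damping * ((1 - cross_weight) * a + cross_weight * b)
                 for a, b in zip(i_within, i_cross)]
        delta = (sum(abs(a - b) for a, b in zip(new_t, t))
                 + sum(abs(a - b) for a, b in zip(new_i, im)))
        t, im = new_t, new_i
        if delta < tol:
            break
    return t, im


# Toy network (made-up values): text 0 is similar to both other texts
# and is the only text linked to the two images.
text_sim = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
image_sim = [[0, 1], [1, 0]]
cross_sim = [[1, 1], [0, 0], [0, 0]]
text_scores, image_scores = co_rank(text_sim, image_sim, cross_sim)
text_ranking = sorted(range(len(text_scores)), key=lambda k: -text_scores[k])
```

In this toy network, text 0 receives votes from both other texts and from both images, so it should rank first among the texts. Because rows with no outgoing links are left as zeros, the scores are relative ranking weights rather than a probability distribution.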
