Generating Pictorial Storylines Via Minimum-Weight Connected Dominating Set Approximation in Multi-View Graphs

This paper introduces a novel framework for generating pictorial storylines for given topics from text and image data on the Internet. Unlike traditional text summarization and timeline generation systems, the proposed framework combines text and image analysis and delivers a storyline containing textual, pictorial, and structural information to provide a sketch of the topic evolution. A key idea in the framework is the use of an approximate solution for the dominating set problem. Given a collection of topic-related objects consisting of images and their text descriptions, a weighted multi-view graph is first constructed to capture the contextual and temporal relationships among these objects. Then the objects are selected by solving the minimum-weighted connected dominating set problem defined on this graph. Comprehensive experiments on real-world data sets demonstrate the effectiveness of the proposed framework.

[1]  Eduard H. Hovy,et al.  From Single to Multi-document Summarization , 2002, ACL.

[2]  Yan Zhang,et al.  Evolutionary timeline summarization: a balanced optimization framework via iterative substitution , 2011, SIGIR.

[3]  Chin-Yew Lin,et al.  From Single to Multi-document Summarization : A Prototype System and its Evaluation , 2002 .

[4]  Yiming Yang,et al.  A study of retrospective and on-line event detection , 1998, SIGIR '98.

[5]  Ricardo Baeza-Yates,et al.  Effectiveness of Temporal Snippets , 2009 .

[6]  Dragomir R. Radev,et al.  Centroid-based summarization of multiple documents , 2004, Inf. Process. Manag..

[7]  Tao Li,et al.  Multi-Document Summarization via the Minimum Dominating Set , 2010, COLING.

[8]  Dafna Shahaf,et al.  Connecting the dots between news articles , 2011, IJCAI 2011.

[9]  Rada Mihalcea,et al.  A Language Independent Algorithm for Single and Multiple Document Summarization , 2005, IJCNLP.

[10]  Jie Tang,et al.  Multi-topic Based Query-Oriented Summarization , 2009, SDM.

[11]  Thorsten Brants,et al.  A System for new event detection , 2003, SIGIR.

[12]  Helena Ahonen-Myka,et al.  Simple Semantics in Topic Detection and Tracking , 2004, Information Retrieval.

[13]  Xin Liu,et al.  Generic text summarization using relevance measure and latent semantic analysis , 2001, SIGIR '01.

[14]  Chris H. Q. Ding,et al.  Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization , 2008, SIGIR '08.

[15]  James Allan,et al.  Text classification and named entities for new event detection , 2004, SIGIR '04.

[16]  Dragomir R. Radev,et al.  LexPageRank: Prestige in Multi-Document Text Summarization , 2004, EMNLP.

[17]  Ran Raz,et al.  A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP , 1997, STOC '97.

[18]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Eduard H. Hovy,et al.  Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics , 2003, NAACL.

[20]  Sudipto Guha,et al.  Approximation algorithms for directed Steiner problems , 1999, SODA '98.

[21]  Joshua Goodman,et al.  Multi-Document Summarization by Maximizing Informative Content-Words , 2007, IJCAI.

[22]  Yiannis S. Boutalis,et al.  CEDD: Color and Edge Directivity Descriptor: A Compact Descriptor for Image Indexing and Retrieval , 2008, ICVS.

[23]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.