Summarize What You Are Interested In: An Optimization Framework for Interactive Personalized Summarization

Most traditional summarization methods treat their outputs as static, plain texts: the generated summary is the same for every user, so reader interests are never captured during summarization. However, users have individual preferences with respect to a particular source document collection, and a universal summary for all users may not always be satisfactory. We therefore investigate an important and challenging problem in summary generation, Interactive Personalized Summarization (IPS), which generates summaries in an interactive and personalized manner. Given the source documents, IPS captures user interests through interactive clicks and incorporates personalization by modeling the captured reader preferences. We develop experimental systems to compare five rival algorithms on four intrinsically different datasets comprising 5,197 documents in total. Evaluation with ROUGE metrics shows that IPS performs comparably to the best competing system, yet its summaries receive much higher user-satisfaction ratings from evaluators. In addition, the low ROUGE consistency among the user-preferred summaries indicates that the outputs are indeed personalized.
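
The abstract describes capturing user interests through clicks, but the optimization details are not spelled out here. Below is a minimal, hypothetical Python sketch of one way click feedback could bias extractive sentence selection: generic salience scores are interpolated with similarity to the clicked content before greedy, redundancy-aware selection. The interpolation weight `alpha`, the TF/cosine representation, and the greedy selection loop are illustrative assumptions, not the IPS optimization framework itself.

```python
# Illustrative sketch only: bias extractive selection toward clicked content.
# `alpha`, the TF/cosine representation, and the greedy loop are assumptions
# made for illustration; they are not the paper's IPS framework.
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for a sentence."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse TF vectors."""
    num = sum(u[t] * v[t] for t in set(u) & set(v))
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0


def personalized_summary(sentences, base_salience, clicked, alpha=0.6, budget=3):
    """Greedily pick sentences, mixing generic salience with similarity to clicks."""
    click_profile = tf_vector(" ".join(clicked)) if clicked else Counter()
    scored = []
    for sent, sal in zip(sentences, base_salience):
        interest = cosine(tf_vector(sent), click_profile) if clicked else 0.0
        scored.append((alpha * sal + (1 - alpha) * interest, sent))

    summary, chosen_vecs = [], []
    for _, sent in sorted(scored, reverse=True):
        vec = tf_vector(sent)
        # Skip sentences too similar to ones already selected (redundancy check).
        if all(cosine(vec, v) < 0.7 for v in chosen_vecs):
            summary.append(sent)
            chosen_vecs.append(vec)
        if len(summary) == budget:
            break
    return summary


if __name__ == "__main__":
    docs = [
        "The summit produced a new climate agreement.",
        "Delegates debated emission targets for hours.",
        "A side event showcased renewable energy startups.",
        "Negotiators finalized emission targets late at night.",
    ]
    salience = [0.9, 0.7, 0.4, 0.6]          # generic importance scores (assumed)
    clicks = ["Delegates debated emission targets for hours."]  # user's click
    print(personalized_summary(docs, salience, clicks, budget=2))
```

With clicks present, sentences about emission targets are promoted over the generic ranking; with an empty click list, the sketch falls back to the generic salience order, which mirrors the contrast between a universal summary and a personalized one described above.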
