A Simple Theoretical Model of Importance for Summarization

Research on summarization has mainly been driven by empirical approaches: systems are crafted to perform well on standard datasets, while the notion of information Importance remains latent. We argue that establishing theoretical models of Importance will advance our understanding of the task and help to further improve summarization systems. To this end, we propose simple but rigorous definitions of several concepts that were previously used only intuitively in summarization: Redundancy, Relevance, and Informativeness. Importance arises as a single quantity naturally unifying these concepts. Additionally, we provide intuitions to interpret the proposed quantities and experiments to demonstrate the potential of the framework to inform and guide subsequent work.
