Entity type modeling for multi-document summarization: generating descriptive summaries of geo-located entities

In this work we investigate the application of entity type models to extractive multi-document summarization, using automatic caption generation for images of geo-located entities (e.g. Westminster Abbey, Loch Ness, Eiffel Tower) as an application scenario. Entity type models contain sets of patterns that aim to capture the ways geo-located entities are described in natural language. They are automatically derived from texts about geo-located entities of the same type (e.g. churches, lakes, towers). We collect texts about geo-located entities from Wikipedia because our investigation shows that the information humans associate with entity types correlates positively with the information contained in Wikipedia articles about the same entity types. We integrate entity type models into a multi-document summarizer and use them to address the two major tasks in extractive multi-document summarization: sentence scoring and summary composition. We experiment with three representation methods for entity type models: signature words, n-gram language models and dependency patterns. First, we hypothesize that entity type models improve sentence scoring, i.e. they help to assign higher scores to sentences that are relevant to the output summary than to sentences that are not. Second, we claim that summary composition can be improved using entity type models. We follow two approaches to integrating the entity type models into our multi-document summarizer. In the first approach we use the entity type models in combination with standard summarization features to score the sentences. We also manually categorize the patterns by the information types they describe and use these categories to reduce redundancy and to produce better flow within the summary. The second approach aims to eliminate the need for manual intervention and to fully automate summary generation. As in the first approach, the sentences are scored using standard summarization features and entity type models. Unlike the first approach, however, summary composition is fully automated and addresses the redundancy and flow aspects of the summary simultaneously. We evaluate the summarizer with integrated entity type models relative to (1) a summarizer using standard text-related features commonly used in summarization and (2) the Wikipedia location descriptions. The latter constitute a strong baseline for automated summaries to be evaluated against. The automated summaries are evaluated against human reference summaries using ROUGE and a human readability evaluation, as is common practice in automatic summarization. Our results show that entity type models significantly improve the quality of output summaries over that of summaries generated using standard summarization features and over that of the Wikipedia baseline summaries. The representation of entity type models using dependency patterns is superior to the representations using signature words and n-gram language models.
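
To make the two tasks named above concrete, the following is a minimal Python sketch of sentence scoring with a signature-word style model, followed by a greedy, redundancy-aware summary composition. It is an illustration under stated assumptions and not the summarizer evaluated in this work: the relative-frequency weighting, the Jaccard overlap threshold, the tiny stopword list, and all function names (`build_signature_model`, `score_sentence`, `compose_summary`) are hypothetical choices made for the example, and neither the standard summarization features nor the flow aspect is modeled.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "and", "to", "it", "by", "on"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens only, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def build_signature_model(type_corpus):
    """Hypothetical entity type model: relative frequency of each word
    in texts about entities of one type (e.g. churches)."""
    counts = Counter(tok for doc in type_corpus for tok in tokenize(doc))
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def score_sentence(sentence, model):
    """Average model weight of the sentence's tokens (0.0 if it has none)."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(model.get(t, 0.0) for t in tokens) / len(tokens)


def jaccard(a, b):
    """Word-set overlap between two sentences, used as a redundancy proxy."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def compose_summary(sentences, model, max_sentences=2, max_overlap=0.5):
    """Greedy composition: take sentences by descending score and skip any
    that overlap too strongly with an already selected sentence."""
    ranked = sorted(sentences, key=lambda s: score_sentence(s, model), reverse=True)
    summary = []
    for s in ranked:
        if all(jaccard(s, chosen) <= max_overlap for chosen in summary):
            summary.append(s)
        if len(summary) == max_sentences:
            break
    return summary


# Toy data, invented for illustration only.
church_corpus = [
    "The church was built in 1200 and is located in York.",
    "The church was built in 1850 in the Gothic style.",
]
candidates = [
    "The abbey was built in 1245.",
    "Tickets can be bought online.",
    "The abbey was built in 1245 by Henry.",
]
model = build_signature_model(church_corpus)
summary = compose_summary(candidates, model)
```

In this toy setting the sentence mentioning when the abbey was built scores higher than the one about tickets, because "built" is frequent in the church texts, and the near-duplicate third sentence is skipped by the overlap check. An n-gram language model or a set of dependency patterns could take the place of `build_signature_model` without changing the composition step.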
