On Conceptual Indexing for Data Summarization

A summary is a comprehensive description that captures the essence of a subject. A text, a collection of text documents, or a query answer can be summarized by simple means, such as an automatically generated list of the most frequent words, or by "advanced" means, such as a meaningful textual description of the subject. Between these two extremes are summaries built from selected key concepts drawn from background knowledge. In this paper we address an approach where conceptual summaries are provided through a conceptualization as given by an ontology. The idea is to restrict a background ontology to the set of concepts that appear in the text to be summarized, and thereby provide a structure, a so-called instantiated ontology, that is specific to the domain of the text and can be used to condense the text into a summary that covers its subject not only quantitatively but also conceptually.
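To make the idea of restricting a background ontology concrete, the following is a minimal sketch and not the paper's implementation. It assumes the ontology is a plain ISA hierarchy stored as a mapping from each concept to its direct parents, and that the instantiated ontology keeps the concepts found in the text together with all their ancestors. The toy ontology `BACKGROUND` and the helper names `instantiate` and `coverage` are hypothetical, introduced only for illustration.

```python
from collections import deque

# Hypothetical toy background ontology: concept -> direct parents (ISA).
BACKGROUND = {
    "poodle": {"dog"},
    "dog": {"animal"},
    "cat": {"animal"},
    "animal": {"entity"},
    "car": {"vehicle"},
    "vehicle": {"entity"},
    "entity": set(),
}


def instantiate(ontology, text_concepts):
    """Restrict `ontology` to the text concepts and their ancestors.

    Assumption: "instantiated ontology" is taken here to mean the upward
    closure of the text concepts under the ISA relation.
    """
    keep = set()
    queue = deque(c for c in text_concepts if c in ontology)
    while queue:
        concept = queue.popleft()
        if concept in keep:
            continue
        keep.add(concept)
        queue.extend(ontology.get(concept, ()))
    # Keep only the ISA edges whose endpoints both survive the restriction.
    return {c: set(ontology.get(c, ())) & keep for c in keep}


def coverage(instantiated, text_concepts):
    """Count, for each concept, how many text concepts it subsumes.

    A concept subsumes itself, so a text concept counts toward its own total.
    """
    counts = {c: 0 for c in instantiated}
    for start in set(text_concepts) & set(instantiated):
        seen, stack = set(), [start]
        while stack:
            concept = stack.pop()
            if concept in seen:
                continue
            seen.add(concept)
            stack.extend(instantiated[concept])
        for concept in seen:
            counts[concept] += 1
    return counts


text_concepts = {"poodle", "cat"}
inst = instantiate(BACKGROUND, text_concepts)
cov = coverage(inst, text_concepts)
```

In this toy case the restriction drops "car" and "vehicle", and "animal" is the most specific concept that covers both text concepts, which is the kind of candidate a conceptual summary could select. How candidates are actually ranked or chosen is left open here.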
