FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering

Semantic Web documents that encode facts about entities on the Web have been growing rapidly in size and evolving over time. Creating summaries of lengthy Semantic Web documents, so that the corresponding entity can be identified quickly, has attracted considerable recent interest. In this paper, we explore automatic summarization techniques that characterize an entity, enable its identification, and produce human-friendly summaries. Specifically, we highlight the importance of diversified (faceted) summaries that combine three dimensions: diversity, uniqueness, and popularity. Our novel diversity-aware entity summarization approach mimics human conceptual clustering techniques to group facts, and picks representative facts from each group to form concise (i.e., short) and comprehensive (i.e., improved coverage through diversity) summaries. We evaluate our approach against state-of-the-art techniques and show that it improves both the quality and the efficiency of entity summarization.
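The selection step described above (group facts, then pick representatives from each group) can be illustrated with a minimal sketch. It is not the paper's algorithm: the facets are assumed to be given already (the clustering step is omitted), the function name `summarize`, the toy facts, and the numeric scores are all hypothetical, and the single score per fact merely stands in for some combination of uniqueness and popularity.

```python
def summarize(facets, k, score):
    """Pick up to k facts, taking the best remaining fact from each facet in turn.

    facets: list of lists of facts (assumed output of a clustering step).
    score:  callable mapping a fact to a number (hypothetical ranking score).
    """
    # Rank the facts inside each non-empty facet, best first.
    ranked = [sorted(f, key=score, reverse=True) for f in facets if f]
    # Visit facets in order of the score of their best fact.
    ranked.sort(key=lambda f: score(f[0]), reverse=True)

    summary = []
    depth = 0
    # Round-robin over facets so every facet is represented before any repeats.
    while len(summary) < k and any(depth < len(f) for f in ranked):
        for f in ranked:
            if depth < len(f) and len(summary) < k:
                summary.append(f[depth])
        depth += 1
    return summary


# Toy (property, value) facts for a made-up entity, with made-up scores.
scores = {
    ("homeTown", "Springfield"): 0.6,
    ("birthYear", "1980"): 0.4,
    ("occupation", "Engineer"): 0.9,
    ("employer", "Acme"): 0.7,
    ("spouse", "Alex"): 0.5,
}
facets = [
    [("birthYear", "1980"), ("homeTown", "Springfield")],
    [("employer", "Acme"), ("occupation", "Engineer")],
    [("spouse", "Alex")],
]

top3 = summarize(facets, 3, lambda fact: scores[fact])
top4 = summarize(facets, 4, lambda fact: scores[fact])
```

With `k = 3`, each of the three facets contributes exactly one fact, so the summary stays short while still covering every group; a fourth slot goes to the second-best fact of the highest-scoring facet.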
