Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking

Entity linking connects the Web of documents with knowledge bases. It is the task of linking an entity mention in text to its corresponding entity in a knowledge base. Although a large body of work has been devoted to automatically generating candidate entities, and to ranking and choosing among them, manual effort is still needed, e.g., to define gold-standard links for evaluating automatic approaches and to improve the quality of links in crowdsourcing approaches. However, structured descriptions of entities in knowledge bases are sometimes very long. To avoid overloading human users with too much information and to help them choose an entity from the candidates more efficiently, we aim to replace entire entity descriptions with compact, equally effective structured summaries that are generated automatically. To achieve this, our approach analyzes entity descriptions in the knowledge base and the context of the entity mention from multiple perspectives, including characterizing and differentiating power, information overlap, and relevance to context. An extrinsic evaluation (in which human users carry out entity linking tasks) and an intrinsic evaluation (in which human users rate summaries) demonstrate that summaries generated by our approach help human users carry out entity linking tasks more efficiently (22-23% faster) without significantly affecting the quality of the links obtained, and that our approach outperforms existing approaches to summarizing entity descriptions.
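
To make two of the perspectives named above concrete (relevance to context and information overlap), the following is a minimal illustrative sketch, not the algorithm used in this work. It assumes an entity description is a list of (property, value) pairs, uses Jaccard similarity over lowercased tokens as a stand-in for both measures, and selects features greedily. The names `summarize`, `jaccard`, `tokens`, and the parameter `overlap_weight` are hypothetical, and characterizing and differentiating power are not modeled here.

```python
from typing import List, Set, Tuple

# A feature of an entity description: (property, value). This representation
# is an assumption made for illustration only.
Feature = Tuple[str, str]


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens of a string."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(features: List[Feature], context: str, k: int = 2,
              overlap_weight: float = 0.5) -> List[Feature]:
    """Greedily pick up to k features that are relevant to the mention
    context while penalizing overlap with features already selected."""
    ctx = tokens(context)
    remaining = list(features)
    selected: List[Feature] = []

    def feature_tokens(f: Feature) -> Set[str]:
        return tokens(f[0] + " " + f[1])

    while remaining and len(selected) < k:
        def score(f: Feature) -> float:
            ft = feature_tokens(f)
            relevance = jaccard(ft, ctx)
            # Highest similarity to any already selected feature.
            redundancy = max(
                (jaccard(ft, feature_tokens(s)) for s in selected),
                default=0.0,
            )
            return relevance - overlap_weight * redundancy

        best = max(remaining, key=score)  # ties go to the earliest feature
        selected.append(best)
        remaining.remove(best)
    return selected
```

For a mention context such as "Jordan scored 40 points for the Bulls", a feature like `("team", "Chicago Bulls")` shares a token with the context and is picked first, while a second feature with nearly the same tokens is penalized for overlap. A real system would replace the token-based measures with ones suited to structured knowledge base data.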
