Linked Data Entity Summarization

On the Web, the amount of structured and Linked Data about entities is constantly growing. Descriptions of single entities often include thousands of statements, which makes the data difficult to comprehend unless a selection of the most relevant facts is provided. This doctoral thesis addresses the problem of Linked Data entity summarization. Its contributions are two entity summarization approaches, a common API for entity summarization, and an approach for entity data fusion.
