Statistical Knowledge Patterns: Identifying Synonymous Relations in Large Linked Datasets

The Web of Data is a rich common resource, with billions of triples available in thousands of datasets and individual Web documents created by both expert and non-expert ontologists. A common problem is imprecision in the use of vocabularies: annotators can misunderstand the semantics of a class or property, or may be unable to find the right objects to annotate with. This decreases the quality of the data and may eventually hamper its usability at large scale. This paper describes Statistical Knowledge Patterns (SKPs) as a means to address this issue. SKPs encapsulate key information about ontology classes, including synonymous properties within (and across) datasets, and are generated automatically by statistical data analysis. SKPs can be used to normalise data automatically, and hence to increase recall in querying. Both pattern extraction and pattern usage are completely automated. The main benefits of SKPs are that: (1) their structure allows for both accurate query expansion and restriction; (2) they are context dependent, so they describe the usage and meaning of properties in the context of a particular class; and (3) they can be generated offline, so the equivalence among relations can be used efficiently at run time.
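
To make the idea of class-dependent synonymous properties concrete, the following is a minimal sketch, not the paper's method. It assumes that two properties are synonym candidates when, restricted to instances of one class, they link largely the same subject-object pairs, and it measures that overlap with a simple Jaccard score. The function names, the threshold, and the `ex1:`/`ex2:` identifiers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def property_pairs(triples, instances):
    """Group (subject, object) pairs by property, keeping only subjects in the class."""
    pairs = defaultdict(set)
    for s, p, o in triples:
        if s in instances:
            pairs[p].add((s, o))
    return pairs


def synonym_candidates(triples, instances, threshold=0.5):
    """Return {property: set of candidate synonyms} for one class.

    Assumption: Jaccard overlap of (subject, object) pairs stands in for
    whatever statistical measure is actually used to build an SKP.
    """
    pairs = property_pairs(triples, instances)
    result = defaultdict(set)
    for p, q in combinations(sorted(pairs), 2):
        union = pairs[p] | pairs[q]
        overlap = len(pairs[p] & pairs[q]) / len(union) if union else 0.0
        if overlap >= threshold:
            result[p].add(q)
            result[q].add(p)
    return dict(result)


def expand_property(prop, synonyms):
    """Expand one query property into itself plus its candidate synonyms."""
    return sorted({prop} | synonyms.get(prop, set()))


# Toy data with hypothetical identifiers; the last triple is outside the class.
triples = [
    ("ex:Berlin", "ex1:populationTotal", "3500000"),
    ("ex:Berlin", "ex2:population", "3500000"),
    ("ex:Paris", "ex1:populationTotal", "2100000"),
    ("ex:Paris", "ex2:population", "2100000"),
    ("ex:Berlin", "ex1:country", "ex:Germany"),
    ("ex:Tolkien", "ex2:population", "0"),
]
cities = {"ex:Berlin", "ex:Paris"}

# Computed offline, once per class; looked up at query time.
city_synonyms = synonym_candidates(triples, cities)
expanded = expand_property("ex1:populationTotal", city_synonyms)
```

Because the candidate sets are computed once per class, expanding a query at run time reduces to a dictionary lookup, which mirrors the offline-generation benefit described in point (3). Restricting the statistics to instances of a single class is what makes the result context dependent, as in point (2).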
