Relatedness between vocabularies on the Web of data: A taxonomy and an empirical study

With thousands of vocabularies published and used on the Web of data, the sociology of vocabulary creation and application, which studies the statistical features of vocabularies from various sources and the relations between them, is receiving increasing attention. In this article, we present a taxonomy of relatedness between vocabularies comprising declarative, topical and distributional perspectives, which are derived from the structural description, textual description and context of use of a vocabulary, respectively. We characterize each perspective with a graph model representing vocabularies and their relatedness, and implement it over a data set containing 2996 vocabularies and 4.1 billion RDF triples, on which we perform degree, connectivity and cluster analysis. We also discuss the correlation between the different perspectives. The results and findings are expected to be useful for future research and development on vocabularies.
