A Snapshot of the OWL Web

Tool development for, and empirical experimentation in, OWL ontology engineering require a wide variety of suitable ontologies as input for testing and evaluation, as well as detailed characterisations of real ontologies. Empirical work often resorts to (somewhat arbitrarily) hand-curated corpora available on the web, such as the NCBO BioPortal and the TONES Repository, or to manually selected sets of well-known ontologies. Survey findings and benchmarking results may therefore be biased, even heavily, towards these datasets. Sampling from a large corpus of ontologies, on the other hand, may lead to more representative results. Current large-scale repositories and web crawls are mostly uncurated: they suffer from duplication, contain many small and (for many purposes) uninteresting ontology files, and include large numbers of ontology versions, variants, and facets. They therefore do not lend themselves to random sampling. In this paper, we survey ontologies as they exist on the web and describe the creation of a corpus of OWL DL ontologies using strategies such as web crawling, various forms of de-duplication, and manual cleaning; the resulting corpus allows random sampling of ontologies for a variety of empirical applications.
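As a rough illustration of why de-duplication has to precede random sampling, the following minimal sketch removes byte-identical files from a crawl and then draws a reproducible sample. It is not the procedure used in the paper: the abstract does not specify the forms of de-duplication applied, so hashing raw file content is an assumption, and the function names (`dedupe_by_content`, `sample_corpus`) and the toy file list are hypothetical.

```python
import hashlib
import random


def dedupe_by_content(files):
    """Keep one (name, content) pair per distinct byte content; first seen wins.

    Assumption: only byte-identical duplicates are detected. Versions and
    variants of the same ontology would need further, unspecified steps.
    """
    seen = set()
    unique = []
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, content))
    return unique


def sample_corpus(files, k, seed=0):
    """Draw up to k files uniformly at random, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(files, min(k, len(files)))


# Hypothetical crawl result: two byte-identical files and one distinct file.
crawled = [
    ("a.owl", b"<Ontology/>"),
    ("a-copy.owl", b"<Ontology/>"),
    ("b.owl", b"<Ontology><Class/></Ontology>"),
]

corpus = dedupe_by_content(crawled)
sample = sample_corpus(corpus, k=1)
```

Without the de-duplication step, the duplicated file would be twice as likely to be drawn as the distinct one, which is the kind of sampling bias the abstract describes.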
