SKOS concepts and natural language concepts: An analysis of latent relationships in KOSs

The Simple Knowledge Organisation System (SKOS) is the vehicle for representing Knowledge Organisation Systems (KOSs) in the environment of the Semantic Web and linked data. SKOS provides a way to assign a Uniform Resource Identifier (URI) to each concept, and this URI functions as a surrogate for the concept. Clarifying the ontological meaning of these URIs is therefore a primary concern. The aim of this study is to investigate the relationship between the ontological substance of KOS concepts and the concepts revealed through the grammatical and syntactic formalisms of natural language. For this purpose, we examined the divisibility of concepts in specific KOSs (i.e. a thesaurus, a subject headings system and a classification scheme) by applying Natural Language Processing (NLP) techniques (i.e. morphosyntactic analysis) to the lexical representations (i.e. RDF literals) of SKOS concepts. The results of the comparative analysis reveal that, despite their use of multi-word units, thesauri tend to represent concepts in a way that can hardly be divided further conceptually, while subject headings and classification schemes comprise, to a certain extent, terms that can be decomposed into further conceptual constituents. Consequently, SKOS concepts deriving from thesauri are more likely to represent atomic conceptual units and thus to be more appropriate tools for inference and reasoning. Since identifiers represent the meaning of a concept, complex concepts are neither the most appropriate nor the most efficient way of modelling a KOS for the Semantic Web.
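The idea of testing whether a concept label can be divided into further conceptual constituents can be illustrated with a minimal sketch. It is not the morphosyntactic analysis used in the study: it assumes a much cruder surface heuristic (splitting on coordinating conjunctions, commas and a `--` subdivision marker), and the function names, marker set and sample labels are all hypothetical.

```python
import re
from typing import List

# Hypothetical surface markers of compound labels (an assumption of this
# sketch, not the study's method, which relies on morphosyntactic analysis).
COORDINATORS = {"and", "or"}
SUBDIVISION = re.compile(r"\s*--\s*")  # e.g. "Topic -- Subdivision"


def split_label(label: str) -> List[str]:
    """Split a label into candidate conceptual constituents."""
    parts: List[str] = []
    for chunk in SUBDIVISION.split(label):
        # Isolate commas as tokens so they can act as boundaries.
        tokens = chunk.replace(",", " , ").split()
        current: List[str] = []
        for token in tokens:
            if token == "," or token.lower() in COORDINATORS:
                # A boundary marker closes the constituent collected so far.
                if current:
                    parts.append(" ".join(current))
                    current = []
            else:
                current.append(token)
        if current:
            parts.append(" ".join(current))
    return parts


def is_decomposable(label: str) -> bool:
    """True if the heuristic finds more than one constituent."""
    return len(split_label(label)) > 1


if __name__ == "__main__":
    # Hypothetical labels in the style of a thesaurus term, a subject
    # heading and a classification caption.
    for label in ["irrigation systems",
                  "Agriculture -- Economic aspects",
                  "Libraries and museums"]:
        print(label, "->", split_label(label))
```

A multi-word label with no boundary marker is treated as atomic, whereas coordinated or subdivided labels yield several constituents. The heuristic is deliberately naive: an inverted heading that uses a comma for word order rather than coordination would be split incorrectly, which is one reason a proper morphosyntactic analysis is needed.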
