Report on the refinement of the proposed models, methods and semantic search

The aim of the INSEMTIVES project is to involve users more heavily in the generation of semantic content, i.e., content with machine-processable formal semantics. The goal of Workpackage 2 (Models and Methods for the Creation and Usage of Lightweight, Structured Knowledge) is to develop models and methods for storing and processing the semantic content produced by users, as well as for helping users in the annotation process. Because end users are not expected to be knowledgeable in semantic technologies, these models need to be suitable for storing {\em lightweight} semantic content that, for example, an ordinary user can generate as part of her everyday activities. The previous deliverables of this Workpackage proposed models and methods based on the requirements collected from the use case partners and on an analysis of the state of the art. These deliverables are: D2.1.1~\cite{D211} (Report on the state-of-the-art and requirements for annotation representation models), D2.1.2~\cite{D212} (Specification of models for representing single-user and community-based annotations of Web resources), D2.2.1~\cite{D221} (Report on methods and algorithms for bootstrapping Semantic Web content from user repositories and reaching consensus on the use of semantics), D2.2.2/D2.2.3~\cite{D222} (Report on methods and algorithms for linking user-generated semantic annotations to Semantic Web and supporting their evolution in time), D2.3.1~\cite{D231} (Requirements for information retrieval (IR) methods for semantic content), and D2.3.2~\cite{D232} (Specification of information retrieval (IR) methods for semantic content). The proposed models and methods were then validated against the evolved requirements of the use case partners, and areas for refinement were identified. This deliverable provides a detailed account of the results of the validation and of the refinements that need to be introduced to the models and algorithms.
In particular, the following algorithms are detailed in this deliverable: (i) the semantic convergence algorithm, which supports the computation of concepts from user annotations and the positioning of these concepts in an ontology; (ii) the annotation evolution algorithm, which supports the recomputation of links from annotations to the underlying ontology as the ontology evolves; (iii) the summarization algorithm, which computes short summaries of concepts from the ontology to help users decide which concepts to use in the annotation process; and (iv) the semantic search algorithm, which uses the underlying ontology to provide the user with more relevant results. The algorithms are described at a level of detail sufficient for reproduction, and their relation to the state of the art is reported wherever possible. The deliverable also presents a platform for creating golden standards for semantic annotation systems and describes a golden standard dataset that was created using the platform and used for the evaluation of some of the proposed algorithms. To the best of our knowledge, this is the first attempt in the Semantic Web community to develop a platform that facilitates the creation of golden standard datasets for annotation systems. The dataset has been exported to RDF and is currently in the process of being included in the Linking Open Data cloud. The platform and the dataset represent a valuable contribution to the community, which has recognised the need for golden standard datasets that can be used for a comparative analysis of existing approaches. This is the concluding deliverable on annotation models and methods in Workpackage 2. Further possible refinements of the models and methods will be reported in publications in scientific conferences, journals, and other venues.

[1]  Ying Zhou,et al.  An Integrated Approach to Extracting Ontological Structures from Folksonomies , 2009, ESWC.

[2]  Jon M. Kleinberg,et al.  Mapping the world's photos , 2009, WWW '09.

[3]  Yong Yu,et al.  Emergent Semantics from Folksonomies: A Quantitative Study , 2006, J. Data Semant.

[4]  Enrico Motta,et al.  Integrating Folksonomies with the Semantic Web , 2007, ESWC.

[5]  Céline Van Damme,et al.  FolksOntology: An Integrated Approach for Turning Folksonomies into Ontologies , 2007 .

[6]  Fausto Giunchiglia,et al.  Social tagging: Semantics are actually used , 2008 .

[7]  Suresh Manandhar,et al.  Extending a Lexical Ontology by a Combination of Distributional Semantics Signatures , 2002, EKAW.

[8]  Rui Xu,et al.  Survey of clustering algorithms , 2005, IEEE Transactions on Neural Networks.

[9]  Oscar Corcho,et al.  Preliminary Results in Tag Disambiguation using DBpedia , 2009 .

[10]  Suresh Manandhar,et al.  Proposal for Evaluating Ontology Refinement Methods , 2002, LREC.

[11]  Peter Mika.  Ontologies Are Us: A Unified Model of Social Networks and Semantics , 2005, International Semantic Web Conference.

[12]  Enrico Motta,et al.  Semantically enriching folksonomies with FLOR , 2008 .

[13]  Borislav Popov,et al.  Report on the State-of-the-Art and Requirements for Annotation Representation Models , 2010 .

[14]  Martha Palmer,et al.  Verb Semantics and Lexical Selection , 1994, ACL.

[15]  C. Bauckhage,et al.  Analyzing Social Bookmarking Systems: A del.icio.us Cookbook , 2008 .

[16]  Ciro Cattuto,et al.  Semantic Grounding of Tag Relatedness in Social Bookmarking Systems , 2008, SEMWEB.

[17]  Che-Yu Yang,et al.  Word Sense Determination using WordNet and Sense Co-occurrence , 2006, 20th International Conference on Advanced Information Networking and Applications - Volume 1 (AINA'06).

[18]  Ron Artstein,et al.  Survey Article: Inter-Coder Agreement for Computational Linguistics , 2008, CL.

[19]  Alberto Córdoba,et al.  Self-adaptation of Ontologies to Folksonomies in Semantic Web , 2008 .

[20]  Nigel Shadbolt,et al.  Understanding the Semantics of Ambiguous Tags in Folksonomies , 2007, ESOE.

[21]  Yanchun Zhang,et al.  Achieving convergence, causality preservation, and intention preservation in real-time cooperative editing systems , 1998, TCHI.

[22]  Udo Hahn,et al.  Towards Text Knowledge Engineering , 1998, AAAI/IAAI.

[23]  Kilian Q. Weinberger,et al.  Resolving tag ambiguity , 2008, ACM Multimedia.

[24]  Xuanjing Huang,et al.  From Web Directories to Ontologies: Natural Language Processing Challenges , 2007, ISWC/ASWC.

[25]  Pierre Andrews,et al.  Report on methods and algorithms for bootstrapping Semantic Web content from user repositories and reaching consensus on the use of semantics , 2010 .

[26]  Steffen Staab,et al.  Emergent Semantics Principles and Issues , 2004, DASFAA.

[27]  Diego Calvanese,et al.  The description logic handbook: theory , 2003 .

[28]  Matthijs C. Dorst.  Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[29]  Asunción Gómez-Pérez,et al.  Review of the state of the art: discovering and associating semantics to tags in folksonomies , 2012, The Knowledge Engineering Review.

[30]  Pierre Andrews,et al.  Specification of Models for Representing Single-user and Community-based Annotations of Web Resources , 2010 .

[31]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[32]  Diego Calvanese,et al.  The Description Logic Handbook: Theory, Implementation, and Applications , 2003, Description Logic Handbook.

[33]  Markus Strohmaier,et al.  A call for social tagging datasets , 2010, LINK.

[34]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[35]  Eneko Agirre,et al.  A Proposal for Word Sense Disambiguation using Conceptual Distance , 1995, ArXiv.

[36]  Fausto Giunchiglia,et al.  Lightweight Parsing of Classifications into Lightweight Ontologies , 2010, ECDL.