Mining Linked Open Data through Semi-supervised Learning Methods Based on Self-Training

The paper tackles the problem of mining linked open data. Because the semantics of the data model rests on the open-world assumption, knowledge is inherently incomplete, and many instances have an uncertain classification. We present a semi-supervised machine learning approach to this problem. Specifically, we adopt a self-training strategy that iteratively uses labeled instances to predict labels for unlabeled instances as well. An extensive empirical evaluation involving several different algorithms demonstrates the added value of the semi-supervised approach over standard supervised methods.
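The self-training loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the base learner is a simple nearest-centroid classifier over toy numeric feature vectors, the confidence score is a distance ratio, and the names `centroids`, `predict_with_confidence`, `self_train`, `threshold`, and `max_iter` are hypothetical.

```python
from math import dist


def centroids(X, y):
    """Fit the (assumed) base learner: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence) for x.

    Confidence is d2 / (d1 + d2), where d1 and d2 are the distances to the
    nearest and second-nearest centroid. It ranges from 0.5 (tie) to 1.0.
    This scoring rule is an illustrative assumption.
    """
    ranked = sorted((dist(x, c), label) for label, c in model.items())
    d1, label = ranked[0]
    if len(ranked) == 1:
        return label, 1.0
    d2 = ranked[1][0]
    total = d1 + d2
    return label, (d2 / total if total > 0 else 1.0)


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_iter=10):
    """Iteratively move confidently predicted instances into the labeled set."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_iter):
        model = centroids(X, y)  # retrain on the current labeled set
        confident, rest = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:  # nothing new to learn from, so stop early
            break
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = rest
    # Final model plus the instances that stayed unlabeled.
    return centroids(X, y), pool
```

Instances whose predicted label never reaches the threshold remain in the returned pool, which loosely mirrors the uncertain classifications that arise under the open-world assumption. Any classifier that outputs a confidence score could replace the nearest-centroid learner in this sketch.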
