Approximating Numeric Role Fillers via Predictive Clustering Trees for Knowledge Base Enrichment in the Web of Data

In the Web of Data, a wealth of properties can be used to link resources to other resources, as well as to literals that specify their attributes. However, owing to its scale and inherent nature, this setting is also characterized by a large amount of missing and incorrect information. To tackle these problems, models and rules learned for predicting unknown values of numeric features can be used to approximate those values and to enrich the schema of a knowledge base, increasing its expressiveness, e.g. by eliciting SWRL rules. In this work, we address the problem of predicting unknown values and deriving rules concerning numeric features expressed as datatype properties. The task can be cast as a regression problem, for which suitable solutions have been devised, for instance, in the related context of relational databases (RDBs). To this end, we adapted the learning of predictive clustering trees to solve multi-target regression problems in the context of knowledge bases of the Web of Data. The approach has been evaluated experimentally, showing interesting results.
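To make the idea of a multi-target regression tree concrete, the following is a minimal sketch and not the system described in this work. It assumes that each individual is described by boolean tests standing in for concept-membership checks, and that a split is chosen by the size-weighted reduction of variance summed over all numeric targets, which is the usual heuristic for predictive clustering trees. The test names (`Capital`, `Coastal`), the two target columns, and all numbers are hypothetical, and the targets are assumed to be on comparable scales.

```python
from statistics import fmean


def total_variance(targets):
    """Population variance of the rows, summed over all target columns."""
    if not targets:
        return 0.0
    n = len(targets)
    total = 0.0
    for column in zip(*targets):
        mean = fmean(column)
        total += sum((v - mean) ** 2 for v in column) / n
    return total


def prototype(targets):
    """Leaf prediction: the mean vector of the targets in the cluster."""
    return [fmean(column) for column in zip(*targets)]


def best_test(rows, targets, tests):
    """Return the test with the largest variance reduction, or None."""
    n = len(rows)
    base = total_variance(targets)
    best, best_gain = None, 1e-12  # ignore splits with no real gain
    for name in tests:
        yes = [t for r, t in zip(rows, targets) if r[name]]
        no = [t for r, t in zip(rows, targets) if not r[name]]
        if not yes or not no:
            continue  # the test does not separate the individuals
        gain = (base
                - len(yes) / n * total_variance(yes)
                - len(no) / n * total_variance(no))
        if gain > best_gain:
            best, best_gain = name, gain
    return best


def grow(rows, targets, tests, min_size=3):
    """Grow a tree top-down; stop on small clusters or useless tests."""
    test = best_test(rows, targets, tests) if len(rows) >= min_size else None
    if test is None:
        return {"leaf": prototype(targets)}
    yes = [i for i, r in enumerate(rows) if r[test]]
    no = [i for i, r in enumerate(rows) if not r[test]]
    rest = [t for t in tests if t != test]
    return {
        "test": test,
        "yes": grow([rows[i] for i in yes], [targets[i] for i in yes], rest, min_size),
        "no": grow([rows[i] for i in no], [targets[i] for i in no], rest, min_size),
    }


def predict(tree, row):
    """Follow the tests down to a leaf and return its mean vector."""
    while "leaf" not in tree:
        tree = tree["yes"] if row[tree["test"]] else tree["no"]
    return tree["leaf"]


# Hypothetical individuals: membership in two made-up concepts,
# and two made-up numeric role fillers per individual.
rows = [
    {"Capital": True, "Coastal": False},
    {"Capital": True, "Coastal": True},
    {"Capital": False, "Coastal": True},
    {"Capital": False, "Coastal": False},
]
targets = [[10.0, 5.0], [12.0, 7.0], [2.0, 1.0], [4.0, 3.0]]

tree = grow(rows, targets, ["Capital", "Coastal"])
print(predict(tree, {"Capital": True, "Coastal": False}))  # prints [11.0, 6.0]
```

In this toy data the `Capital` test is selected at the root because it separates the two target columns cleanly, while `Coastal` yields no variance reduction. Each root-to-leaf path can be read as a rule whose body is the conjunction of tests and whose head assigns the leaf's mean values, which hints at how rules over numeric features might be derived from such a tree.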
