Induction of Classifiers through Non-Parametric Methods for Approximate Classification and Retrieval with Ontologies

This work concerns non-parametric statistical learning approaches applied to the standard knowledge representation languages adopted in the Semantic Web. We present methods based on epistemic inference that elicit and exploit the semantic similarity of individuals in OWL knowledge bases. Specifically, we introduce a fully semantic, language-independent semi-distance function, from which an epistemic kernel function for Semantic Web representations is also derived. Both the measure and the kernel are embedded in non-parametric statistical learning algorithms customized for Semantic Web representations: the measure in a k-Nearest Neighbor algorithm and the kernel in a Support Vector Machine. The implemented algorithms are used to perform inductive concept retrieval and query answering. Experiments on real ontologies show that the methods can be effectively employed for the target tasks and, moreover, that they can induce new assertions that are not logically derivable.
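
To make the idea of embedding a semi-distance in a k-Nearest Neighbor procedure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: each individual is assumed to be already projected onto a fixed set of feature concepts, with value 1 when membership is entailed, 0 when membership in the complement is entailed, and 0.5 when neither can be derived. The names `semi_distance` and `knn_classify`, the exponent `p`, and the toy data are all hypothetical, and the reasoner calls that would produce the projections are omitted.

```python
from collections import Counter
from typing import List, Sequence


def semi_distance(a: Sequence[float], b: Sequence[float], p: int = 2) -> float:
    """Minkowski-style semi-distance between two projection vectors.

    ASSUMPTION: each entry is 1.0 (membership entailed), 0.0 (membership in
    the complement entailed) or 0.5 (unknown). The result is normalized by
    the number of features.
    """
    if len(a) != len(b) or not a:
        raise ValueError("projection vectors must be non-empty and equally long")
    total = sum(abs(x - y) ** p for x, y in zip(a, b))
    return (total ** (1.0 / p)) / len(a)


def knn_classify(query: Sequence[float],
                 training: List[Sequence[float]],
                 labels: List[int],
                 k: int = 3) -> int:
    """Majority vote among the k training individuals closest to the query.

    Labels are assumed to be +1 (member), -1 (non-member) or 0 (unknown).
    """
    ranked = sorted(range(len(training)),
                    key=lambda i: semi_distance(query, training[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: four individuals described by three hypothetical feature concepts.
TRAINING = [[1.0, 1.0, 0.5], [1.0, 0.5, 0.0], [0.0, 0.0, 1.0], [0.0, 0.5, 1.0]]
LABELS = [1, 1, -1, -1]
QUERY = [1.0, 1.0, 0.0]

if __name__ == "__main__":
    print(knn_classify(QUERY, TRAINING, LABELS, k=3))
```

Because the function compares only projection values, two distinct individuals with identical projections are at distance zero, which is why it is a semi-distance and not a metric. The same projection vectors could in principle feed a kernel for a Support Vector Machine, but that part is not sketched here.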
