Inductive learning for the Semantic Web: What does it buy?

Building ontologies is currently a time-consuming task, since they are mostly built by hand. This hinders the full realization of the Semantic Web vision. To overcome this issue, machine learning techniques, and specifically inductive learning methods, could be fruitfully exploited to learn models from existing Web data. In this paper we survey methods for (semi-)automatically building and enriching ontologies from existing sources of information such as Linked Data, tagged data, social networks, and ontologies. In this way, a large number of ontologies could quickly be made available, leaving knowledge engineers only the task of refining them. Furthermore, inductive incremental learning techniques could be adopted to perform reasoning at large scale, where the deductive approach has shown its limitations. Indeed, incremental methods make it possible to learn models from samples of data and then to refine or enrich them when new (samples of) data become available. On the one hand, this means abandoning sound and complete reasoning procedures in favor of uncertain conclusions; on the other hand, it could make it possible to reason over the entire Web. Moreover, adopting inductive learning methods could also make it possible to deal with the intrinsic uncertainty of the Web, which by its nature may contain incomplete and/or contradictory information.
