Inductive Classification Through Evidence-Based Models and Their Ensembles

In the context of the Semantic Web, one of the most important issues in class-membership prediction through inductive models on ontological knowledge bases is the imbalance in the distribution of training examples, mostly due to the heterogeneous nature and the incompleteness of the knowledge bases. An ensemble learning approach has been proposed to cope with this problem. However, the majority voting procedure used to decide the membership does not explicitly consider the uncertainty and the conflict among the classifiers of an ensemble model. Starting from this observation, we propose to integrate the Dempster-Shafer (DS) theory with ensemble learning. Specifically, we propose an algorithm for learning Evidential Terminological Random Forest models, an extension of Terminological Random Forests based on the DS theory. An empirical evaluation showed that: (i) the resulting models perform better on datasets with many positive and negative examples and behave less conservatively than the voting-based forests; (ii) the new extension decreases the variance of the results.
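To illustrate how evidence combination differs from majority voting, the following is a minimal sketch in Python. It assumes that each tree in a forest outputs a mass function over the frame {+1, -1} (member, non-member), with the mass on the whole frame expressing ignorance, and that the masses are pooled with Dempster's rule. The abstract does not state which combination rule the proposed algorithm uses, so the rule, the function names (`dempster_combine`, `combine_forest`, `decide`) and the decision criterion are assumptions made for illustration only.

```python
from itertools import product

# Frame of discernment for class membership: member (+1) or non-member (-1).
POS = frozenset({"+1"})
NEG = frozenset({"-1"})
UNK = POS | NEG  # mass on the whole frame: evidence committed to neither class


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule.

    Returns the normalized combined mass function and the conflict mass.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements: the two sources contradict each other.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    normalized = {s: w / (1.0 - conflict) for s, w in combined.items()}
    return normalized, conflict


def combine_forest(masses):
    """Pool the mass functions produced by all trees (hypothetical helper)."""
    result = masses[0]
    for m in masses[1:]:
        result, _ = dempster_combine(result, m)
    return result


def decide(m):
    """Pick the class with the larger belief; on singletons belief equals mass."""
    pos, neg = m.get(POS, 0.0), m.get(NEG, 0.0)
    if pos > neg:
        return "+1"
    if neg > pos:
        return "-1"
    return "unknown"


# Illustrative (made-up) outputs of three trees for one individual.
trees = [
    {POS: 0.6, NEG: 0.1, UNK: 0.3},
    {POS: 0.5, NEG: 0.2, UNK: 0.3},
    {POS: 0.1, NEG: 0.3, UNK: 0.6},
]
print(decide(combine_forest(trees)))
```

Unlike a vote count, the pooled mass function keeps track of how much evidence remains uncommitted and how strongly the trees disagree, which is the information the abstract says majority voting ignores.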
