Online active learning of decision trees with evidential data

Learning from uncertain data has attracted increasing attention in recent years. In this paper, we propose a decision tree learning method that can not only handle uncertain data, but also reduce epistemic uncertainty by querying the most valuable uncertain instances within the learning procedure. Specifically, we use entropy intervals extracted from the evidential likelihood to query uncertain training instances when needed, with the goal of improving the selection of the splitting attribute. Experimental results under various conditions confirm the value of the proposed approach.

Highlights

Active belief decision trees are learned from uncertain data modelled by belief functions.

A query strategy is proposed to query the most valuable uncertain instances while learning decision trees.

To deal with evidential data, entropy intervals are extracted from the evidential likelihood.

Experiments with UCI data illustrate the robustness of the proposed approach to various kinds of uncertain data.
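To make the notion of an entropy interval concrete, the following is a minimal sketch for a two-class node, not the paper's implementation. It assumes each training label is described by a pair of plausibilities (pl_i(0), pl_i(1)), takes the evidential likelihood of the class-1 proportion theta as the product over instances of (1 - theta) * pl_i(0) + theta * pl_i(1), and keeps the values of theta whose relative likelihood is at least a threshold alpha. The function names, the grid search, and the default alpha = 0.5 are illustrative assumptions.

```python
import math


def evidential_likelihood(theta, pl):
    """Likelihood of theta = P(class 1) given uncertain labels.

    pl is a list of (pl_i(0), pl_i(1)) plausibility pairs (assumed input format).
    """
    result = 1.0
    for pl0, pl1 in pl:
        result *= (1.0 - theta) * pl0 + theta * pl1
    return result


def binary_entropy(theta):
    """Shannon entropy (in bits) of a Bernoulli(theta) distribution."""
    if theta <= 0.0 or theta >= 1.0:
        return 0.0
    return -(theta * math.log2(theta) + (1.0 - theta) * math.log2(1.0 - theta))


def entropy_interval(pl, alpha=0.5, grid_size=1001):
    """Min and max entropy over the theta values with relative likelihood >= alpha.

    A plain grid search over [0, 1]; alpha and grid_size are hypothetical defaults.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    lik = [evidential_likelihood(t, pl) for t in grid]
    max_lik = max(lik)
    if max_lik <= 0.0:
        raise ValueError("likelihood is zero everywhere on the grid")
    kept = [t for t, value in zip(grid, lik) if value / max_lik >= alpha]
    entropies = [binary_entropy(t) for t in kept]
    return min(entropies), max(entropies)


if __name__ == "__main__":
    precise = [(0.0, 1.0)] * 8 + [(1.0, 0.0)] * 2   # 8 certain class-1, 2 certain class-0
    vacuous = [(1.0, 1.0)] * 10                     # labels carry no information
    print(entropy_interval(precise))
    print(entropy_interval(vacuous))
```

With fully vacuous labels the likelihood is flat, so every theta is retained and the interval spans the whole entropy range; precise labels give a narrower interval. Wide or overlapping intervals of this kind are what could signal that querying an uncertain instance is worthwhile before choosing a split.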
