Multi-dimensional Bayesian network classifiers: A survey

Multi-dimensional classification is the problem of simultaneously assigning values to multiple class variables for a given example. It extends the well-known multi-label subproblem, in which all class variables are binary. In this article, we review and expand the set of performance evaluation measures suitable for assessing multi-dimensional classifiers. We focus on multi-dimensional Bayesian network classifiers, which address multi-dimensional classification directly and account for dependencies among class variables. We offer a comprehensive survey of this state-of-the-art classification model, covering the complexity of its learning and inference processes. We also describe algorithms for structural learning, present real-world applications in which these classifiers have been used, and compile a collection of related software.
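As a rough illustration of the setting, the sketch below computes two measures commonly used for multi-dimensional classifiers: mean accuracy (the average of per-class-variable accuracies) and global accuracy (the fraction of examples for which every class variable is predicted correctly). The function names and the toy data are assumptions made for this example and are not taken from the article.

```python
from typing import Hashable, Sequence

Labels = Sequence[Sequence[Hashable]]  # one tuple of class values per example


def mean_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Average, over class variables, of the per-variable accuracy."""
    n_examples = len(y_true)
    n_classes = len(y_true[0])
    per_variable = [
        sum(t[j] == p[j] for t, p in zip(y_true, y_pred)) / n_examples
        for j in range(n_classes)
    ]
    return sum(per_variable) / n_classes


def global_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of examples whose whole class vector is predicted exactly."""
    exact = sum(tuple(t) == tuple(p) for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


# Toy data (hypothetical): three examples, three class variables,
# not all of them binary, which is what separates this setting
# from multi-label classification.
y_true = [("a", 0, "x"), ("b", 1, "y"), ("a", 1, "x")]
y_pred = [("a", 0, "x"), ("b", 0, "y"), ("b", 1, "y")]

print(mean_accuracy(y_true, y_pred))
print(global_accuracy(y_true, y_pred))
```

Global accuracy is the stricter of the two, since a single wrong class variable makes the whole prediction count as an error. It can never exceed mean accuracy.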
