Discrete Bayesian Network Classifiers

It took more than 30 years after the naive Bayes model was first introduced in 1960 for the so-called Bayesian network classifiers to emerge. Built on Bayesian networks, these classifiers have many strengths: model interpretability, the ability to accommodate complex data and classification problem settings, efficient algorithms for learning and classification tasks, and successful application to real-world problems. In this article, we survey the whole set of discrete Bayesian network classifiers devised to date, organized in increasing order of structural complexity: naive Bayes, selective naive Bayes, seminaive Bayes, one-dependence Bayesian classifiers, k-dependence Bayesian classifiers, Bayesian network-augmented naive Bayes, Markov blanket-based Bayesian classifiers, unrestricted Bayesian classifiers, and Bayesian multinets. We also cover feature subset selection and both generative and discriminative learning of structure and parameters.
