Analysis of Nasopharyngeal Carcinoma Data with a Novel Bayesian Network Learning Algorithm

Learning the structure of a Bayesian network from a data set is NP-hard. In this paper, we discuss a novel heuristic called polynomial max-min skeleton (PMMS), developed by Tsamardinos et al. in 2005. Extensive empirical simulations showed that PMMS offers an excellent trade-off between running time and quality of reconstruction compared with other constraint-based algorithms, especially for smaller sample sizes. Unfortunately, PMMS has two main problems: it can handle neither missing data nor data sets containing functional dependencies between variables. We propose a way to overcome both problems. The new version of PMMS is first applied to standard benchmarks to recover the original structure from data. The algorithm is then applied to the nasopharyngeal carcinoma (NPC) data set, which consists of only 1289 incomplete records, in order to shed some light on the statistical profile of the population under study.
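To illustrate why functional dependencies are a problem for constraint-based learning in general, the following minimal sketch builds a toy data set in which Y is a deterministic function of X and T depends on Y only. It is not the PMMS algorithm or the fix proposed in the paper; the variable names, the synthetic data, and the helper `conditional_mutual_information` are all assumptions made for illustration. The empirical conditional mutual information is used here as a stand-in for a conditional independence test.

```python
import math
import random
from collections import Counter


def conditional_mutual_information(rows, a, b, cond=()):
    """Empirical I(a; b | cond) in nats, computed from a list of dict records.

    `cond` is a tuple of variable names; an empty tuple gives the marginal
    mutual information I(a; b).
    """
    n = len(rows)
    n_abc, n_ac, n_bc, n_c = Counter(), Counter(), Counter(), Counter()
    for r in rows:
        c = tuple(r[v] for v in cond)
        n_abc[(r[a], r[b], c)] += 1
        n_ac[(r[a], c)] += 1
        n_bc[(r[b], c)] += 1
        n_c[c] += 1
    total = 0.0
    for (va, vb, c), count in n_abc.items():
        ratio = count * n_c[c] / (n_ac[(va, c)] * n_bc[(vb, c)])
        total += (count / n) * math.log(ratio)
    return total


def make_rows(n=2000, seed=0):
    """Hypothetical toy data: Y = X mod 2 (functional), T is a noisy copy of Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randrange(4)
        y = x % 2                                  # functional dependency Y = f(X)
        t = y if rng.random() < 0.9 else 1 - y     # T depends on Y only
        rows.append({"X": x, "Y": y, "T": t})
    return rows


rows = make_rows()
mi_yt = conditional_mutual_information(rows, "Y", "T")
cmi_yt_given_x = conditional_mutual_information(rows, "Y", "T", ("X",))
cmi_xt_given_y = conditional_mutual_information(rows, "X", "T", ("Y",))

print("I(Y;T)   =", mi_yt)            # clearly positive: T really depends on Y
print("I(Y;T|X) =", cmi_yt_given_x)   # zero: Y is constant once X is known
print("I(X;T|Y) =", cmi_xt_given_y)   # near zero: X adds nothing beyond Y
```

In this toy setting, a procedure that removes an edge whenever some conditioning set makes two variables look independent would drop both Y–T and X–T, leaving T disconnected even though it depends strongly on Y. This is the kind of failure that motivates treating functional dependencies explicitly.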
