Multi-objective optimization to explicitly account for model complexity when learning Bayesian Networks

Bayesian Networks have been widely used over the last decades in many fields to describe statistical dependencies among random variables. In general, learning the structure of such models is a problem of considerable theoretical interest that still poses many challenges. On the one hand, it is a well-known NP-hard problem, made harder in practice by the super-exponential search space of possible solutions. On the other hand, the phenomenon of I-equivalence, i.e., different graphical structures underpinning the same set of statistical dependencies, may lead to multimodal fitness landscapes that further hinder maximum-likelihood approaches. Despite these difficulties, greedy search methods based on a likelihood score coupled with a regularization term to account for model complexity have proven surprisingly effective in practice. In this paper, we formulate the task of learning the structure of Bayesian Networks as an optimization problem based on a likelihood score. However, our approach does not adjust this score by means of any of the complexity terms proposed in the literature; instead, it accounts for the complexity of the discovered solutions directly, by exploiting a multi-objective optimization procedure. To this end, we adopt NSGA-II, defining the first objective function as the likelihood of a solution and the second as the number of selected arcs. We thoroughly analyze the behavior of our method on a wide set of simulated data, and we discuss its performance considering the goodness of the inferred solutions both in terms of their objective functions and with respect to the retrieved structure. Our results show that NSGA-II can converge to solutions characterized by a better likelihood and fewer arcs than classic approaches, although, somewhat paradoxically, these solutions frequently exhibit a lower similarity to the target network.
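The two-objective setup described above (maximize likelihood, minimize the number of arcs) can be illustrated with a minimal sketch of the Pareto-dominance test that underlies NSGA-II's non-dominated sorting. This is not the paper's implementation; the candidate scores below are invented for illustration, and candidates are reduced to their (log-likelihood, arc-count) pairs.

```python
# Hedged sketch: Pareto ranking of candidate network structures under the
# two objectives named in the abstract. Each candidate is a pair
# (log_likelihood, n_arcs); log-likelihood is maximized, arcs minimized.

def dominates(a, b):
    """True if candidate a Pareto-dominates b: a is at least as good on
    both objectives and strictly better on at least one."""
    at_least_as_good = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return at_least_as_good and strictly_better

def pareto_front(candidates):
    """Return the non-dominated subset (NSGA-II's first front)."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# Toy candidates: (log-likelihood, number of selected arcs).
solutions = [(-120.0, 8), (-118.5, 10), (-118.5, 9), (-125.0, 5), (-119.0, 12)]
front = pareto_front(solutions)
# The front trades likelihood against sparsity: no member is beaten on
# both objectives simultaneously by any other candidate.
```

In a full NSGA-II run this ranking is combined with crowding-distance selection and DAG-preserving crossover/mutation operators; the sketch only shows how complexity is handled as a separate objective rather than as a penalty term added to the score.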
