Evolutionary approaches for the reverse-engineering of gene regulatory networks: A study on a biologically realistic dataset

BackgroundInferring gene regulatory networks from data requires the development of algorithms devoted to structure extraction. When only static data are available, gene interactions may be modelled by a Bayesian Network (BN) that represents the presence of direct interactions from regulators to regulees by conditional probability distributions. We used enhanced evolutionary algorithms to stochastically evolve a set of candidate BN structures and found the model that best fits data without prior knowledge.ResultsWe proposed various evolutionary strategies suitable for the task and tested our choices using simulated data drawn from a given bio-realistic network of 35 nodes, the so-called insulin network, which has been used in the literature for benchmarking. We assessed the inferred models against this reference to obtain statistical performance results. We then compared performances of evolutionary algorithms using two kinds of recombination operators that operate at different scales in the graphs. We introduced a niching strategy that reinforces diversity through the population and avoided trapping of the algorithm in one local minimum in the early steps of learning. We show the limited effect of the mutation operator when niching is applied. Finally, we compared our best evolutionary approach with various well known learning algorithms (MCMC, K2, greedy search, TPDA, MMHC) devoted to BN structure learning.ConclusionWe studied the behaviour of an evolutionary approach enhanced by niching for the learning of gene regulatory networks with BN. We show that this approach outperforms classical structure learning methods in elucidating the original model. These results were obtained for the learning of a bio-realistic network and, more importantly, on various small datasets. This is a suitable approach for learning transcriptional regulatory networks from real datasets without prior knowledge.

[1]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[2]  Nir Friedman,et al.  Being Bayesian About Network Structure. A Bayesian Approach to Structure Discovery in Bayesian Networks , 2004, Machine Learning.

[3]  Pedro Larrañaga,et al.  Structure Learning of Bayesian Networks by Genetic Algorithms: A Performance Analysis of Control Parameters , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Nir Friedman,et al.  Inferring Cellular Networks Using Probabilistic Graphical Models , 2004, Science.

[5]  Little,et al.  [Lecture Notes in Mathematics] Combinatorial Mathematics V Volume 622 || Counting unlabeled acyclic digraphs , 1977 .

[6]  Satoru Miyano,et al.  Estimation of Genetic Networks and Functional Structures Between Genes by Using Bayesian Networks and Nonparametric Regression , 2001, Pacific Symposium on Biocomputing.

[7]  Finn V. Jensen,et al.  Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.

[8]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[9]  Nir Friedman,et al.  Inferring subnetworks from perturbed expression profiles , 2001, ISMB.

[10]  Kathryn B. Laskey,et al.  Learning Bayesian networks from incomplete data using evolutionary algorithms , 1999 .

[11]  D. Pe’er,et al.  Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data , 2003, Nature Genetics.

[12]  Tommi S. Jaakkola,et al.  Using Graphical Models and Genomic Expression Data to Statistically Validate Models of Genetic Regulatory Networks , 2000, Pacific Symposium on Biocomputing.

[13]  M. Reinders,et al.  Genetic network modeling. , 2002, Pharmacogenomics.

[14]  David Maxwell Chickering,et al.  Optimal Structure Identification With Greedy Search , 2002, J. Mach. Learn. Res..

[15]  Constantin F. Aliferis,et al.  Causal Explorer: A Causal Probabilistic Network Learning Toolkit for Biomedical Discovery , 2003, METMBS.

[16]  Constantin F. Aliferis,et al.  The max-min hill-climbing Bayesian network structure learning algorithm , 2006, Machine Learning.

[17]  David J. Spiegelhalter,et al.  Probabilistic Networks and Expert Systems , 1999, Information Science and Statistics.

[18]  Michal Linial,et al.  Using Bayesian Networks to Analyze Expression Data , 2000, J. Comput. Biol..

[19]  David Maxwell Chickering,et al.  Learning Bayesian Networks is NP-Complete , 2016, AISTATS.

[20]  Nicolas Brunel,et al.  Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological networks inference , 2007, Bioinform..

[21]  D Husmeier,et al.  Reverse engineering of genetic networks with Bayesian networks. , 2003, Biochemical Society transactions.

[22]  Wray L. Buntine A Guide to the Literature on Learning Probabilistic Networks from Data , 1996, IEEE Trans. Knowl. Data Eng..

[23]  Hidde de Jong,et al.  Modeling and Simulation of Genetic Regulatory Systems: A Literature Review , 2002, J. Comput. Biol..

[24]  David A. Bell,et al.  Learning Bayesian networks from data: An information-theory based approach , 2002, Artif. Intell..

[25]  Gregory F. Cooper,et al.  A Bayesian method for the induction of probabilistic networks from data , 1992, Machine Learning.

[26]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[27]  William H. Hsu,et al.  A Permutation Genetic Algorithm For Variable Ordering In Learning Bayesian Networks From Data , 2002, GECCO.

[28]  Lyle H. Ungar,et al.  Using prior knowledge to improve genetic network reconstruction from microarray data , 2004, Silico Biol..

[29]  Thomas Gärtner,et al.  Graph kernels and Gaussian processes for relational reinforcement learning , 2006, Machine Learning.

[30]  Pedro Larrañaga,et al.  Analysis of the behaviour of genetic algorithms when learning Bayesian network structure from data , 1997, Pattern Recognit. Lett..

[31]  P. Spirtes,et al.  Causation, prediction, and search , 1993 .

[32]  Jesper Tegnér,et al.  Growing Bayesian network models of gene networks from seed genes , 2005, ECCB/JBI.

[33]  K. Dejong,et al.  An analysis of the behavior of a class of genetic adaptive systems , 1975 .

[34]  Samir W. Mahfoud Niching methods for genetic algorithms , 1996 .

[35]  R. W. Robinson Counting unlabeled acyclic digraphs , 1977 .

[36]  Judea Pearl,et al.  Equivalence and Synthesis of Causal Models , 1990, UAI.

[37]  Carlos Cotta,et al.  Analyzing Directed Acyclic Graph Recombination , 2001, Fuzzy Days.

[38]  Paolo Giudici,et al.  Improving Markov Chain Monte Carlo Model Search for Data Mining , 2004, Machine Learning.

[39]  David Maxwell Chickering,et al.  A Transformational Characterization of Equivalent Bayesian Network Structures , 1995, UAI.

[40]  Pedro Larrañaga,et al.  Learning Bayesian network structures by searching for the best ordering with genetic algorithms , 1996, IEEE Trans. Syst. Man Cybern. Part A.

[41]  Robert Castelo,et al.  Improved learning of Bayesian networks , 2001, UAI.

[42]  David Heckerman Likelihoods and Parameter Priors for Bayesian Networks , 1995 .

[43]  Richard Scheines,et al.  Constructing Bayesian Network Models of Gene Expression Networks from Microarray Data , 2000 .

[44]  BuntineWray A Guide to the Literature on Learning Probabilistic Networks from Data , 1996 .

[45]  Carlos Cotta,et al.  Towards a More Efficient Evolutionary Induction of Bayesian Networks , 2002, PPSN.

[46]  Kwong-Sak Leung,et al.  Using Evolutionary Programming and Minimum Description Length Principle for Data Mining of Bayesian Networks , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[47]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .