MEIGO: an open-source software suite based on metaheuristics for global optimization in systems biology and bioinformatics

Background: Optimization is the key to solving many problems in computational biology. Global optimization methods provide a robust methodology, and metaheuristics in particular have proven to be the most efficient methods for many applications. Despite their utility, the availability of metaheuristic tools is limited.

Results: We present MEIGO, an R and Matlab optimization toolbox (also available in Python via a wrapper of the R version) that implements metaheuristics capable of solving diverse problems arising in systems biology and bioinformatics. The toolbox includes the enhanced scatter search method (eSS) for continuous nonlinear programming (cNLP) and mixed-integer nonlinear programming (MINLP) problems, and variable neighborhood search (VNS) for integer programming (IP) problems. Additionally, the R version includes BayesFit for parameter estimation by Bayesian inference. The eSS and VNS methods can be run on a single thread or in parallel using a cooperative strategy. The code is supplied under GPLv3 and is available at http://www.iim.csic.es/~gingproc/meigo.html. Documentation and examples are included. The R package has been submitted to Bioconductor. We evaluate MEIGO against optimization benchmarks and illustrate its applicability to a series of case studies in bioinformatics and systems biology, where it outperforms other state-of-the-art methods.

Conclusions: MEIGO provides a free, open-source platform for optimization that can be applied to multiple domains of systems biology and bioinformatics. It includes efficient state-of-the-art metaheuristics, and its open and modular structure allows the addition of further methods.
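To make the idea of variable neighborhood search for integer problems concrete, the following is a minimal, generic sketch in Python. It is not MEIGO's implementation and does not use MEIGO's API: the function names (`vns_integer`, `local_search`), the parameters (`k_max`, `max_iter`, `seed`), and the unit-step neighborhoods are all assumptions chosen for illustration. The sketch alternates a "shaking" step, which perturbs a growing number of coordinates, with a simple local descent, and returns to the smallest neighborhood whenever an improvement is found.

```python
import random


def local_search(objective, x, lower, upper):
    """Descent over +/-1 moves on each coordinate (illustrative helper)."""
    x = list(x)
    val = objective(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            for step in (-1, 1):
                xi = x[i] + step
                if lower[i] <= xi <= upper[i]:
                    trial = x[:i] + [xi] + x[i + 1:]
                    trial_val = objective(trial)
                    if trial_val < val:
                        x, val = trial, trial_val
                        improved = True
    return x, val


def vns_integer(objective, x0, lower, upper, k_max=3, max_iter=200, seed=0):
    """Basic VNS over a box-constrained integer vector (hypothetical sketch)."""
    rng = random.Random(seed)
    best = list(x0)
    best_val = objective(best)
    for _ in range(max_iter):
        k = 1
        while k <= k_max:
            # Shaking: move k randomly chosen coordinates by +/-1, clipped to bounds.
            cand = list(best)
            for i in rng.sample(range(len(cand)), min(k, len(cand))):
                moved = cand[i] + rng.choice((-1, 1))
                cand[i] = min(upper[i], max(lower[i], moved))
            # Local search started from the shaken point.
            cand, cand_val = local_search(objective, cand, lower, upper)
            if cand_val < best_val:
                # Improvement: accept and restart from the smallest neighborhood.
                best, best_val = cand, cand_val
                k = 1
            else:
                # No improvement: try a larger neighborhood.
                k += 1
    return best, best_val
```

As a usage example, minimizing a separable quadratic over integers in a box, with a hypothetical target vector, could look like `vns_integer(lambda x: sum((a - b) ** 2 for a, b in zip(x, [3, -2, 5])), [0, 0, 0], [-10] * 3, [10] * 3)`. Real applications would replace the objective with a problem-specific cost and would typically use richer neighborhood structures.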
