A parallel metaheuristic for large mixed-integer dynamic optimization problems, with applications in computational biology

Background: We consider a general class of global optimization problems involving nonlinear dynamic models. Although this class is relevant to many areas of science and engineering, here we are interested in applying this framework to the reverse engineering problem in computational systems biology, which yields very large mixed-integer dynamic optimization (MIDO) problems. In particular, we consider the framework of logic-based ordinary differential equations (ODEs).

Methods: We present saCeSS2, a parallel method for the solution of this class of problems. This method is based on a parallel cooperative scatter search metaheuristic, with new mechanisms of self-adaptation and specific extensions to handle large mixed-integer problems. We have paid special attention to avoiding convergence stagnation by using adaptive cooperation strategies tailored to this class of problems.

Results: We illustrate its performance with a set of three very challenging case studies from the domain of dynamic modelling of cell signaling. The simplest case study considers a synthetic signaling pathway and has 84 continuous and 34 binary decision variables. A second case study considers the dynamic modeling of signaling in liver cancer using high-throughput data, and has 135 continuous and 109 binary decision variables. The third case study is an extremely difficult problem related to breast cancer, involving 690 continuous and 138 binary decision variables. We report computational results obtained on different infrastructures, including a local cluster, a large supercomputer and a public cloud platform. Interestingly, the results show how the cooperation of individual parallel searches modifies the systemic properties of the sequential algorithm, achieving superlinear speedups compared to an individual search (e.g. speedups of 15 with 10 cores), and significantly improving (by more than 60%) the performance with respect to a non-cooperative parallel scheme. The scalability of the method is also good (tests were performed using up to 300 cores).

Conclusions: These results demonstrate that saCeSS2 can be used to successfully reverse engineer large dynamic models of complex biological pathways. Further, these results open up new possibilities for other MIDO-based large-scale applications in the life sciences, such as metabolic engineering, synthetic biology, and drug scheduling.
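As a brief illustration of what "superlinear" means for the speedup figure quoted in the Results, the sketch below uses the standard definitions of speedup (sequential time divided by parallel time) and parallel efficiency (speedup divided by number of cores). The function names and the timing values are hypothetical and chosen only to reproduce the quoted ratio; they are not measurements from the study.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Speedup: how many times faster the parallel run is."""
    return t_sequential / t_parallel


def efficiency(s: float, cores: int) -> float:
    """Parallel efficiency: speedup per core (1.0 is ideal linear scaling)."""
    return s / cores


# Hypothetical timings chosen only to give the quoted speedup of 15.
s = speedup(t_sequential=30.0, t_parallel=2.0)
cores = 10

e = efficiency(s, cores)
is_superlinear = s > cores  # speedup exceeds the number of cores

print(f"speedup={s}, efficiency={e}, superlinear={is_superlinear}")
```

An efficiency above 1 cannot come from dividing the same work among more cores; it indicates that the cooperating searches together follow a different, more effective search trajectory than a single search does.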
