GPdoemd: A Python package for design of experiments for model discrimination

Abstract

Model discrimination identifies a mathematical model that usefully explains and predicts a given system's behaviour. Researchers often have several models, i.e. hypotheses, about an underlying system mechanism, but insufficient experimental data to discriminate between the models, i.e. to discard inaccurate models. Given rival mathematical models and an initial experimental data set, optimal design of experiments suggests maximally informative experimental observations that maximise a design criterion weighted by prediction uncertainty. The model uncertainty requires gradients, which may not be readily available for black-box models. This paper (i) proposes a new design criterion using the Jensen-Rényi divergence, and (ii) develops a novel method replacing black-box models with Gaussian process surrogates. Using the surrogates, we marginalise out the model parameters with approximate inference. Results show that these contributions work well for both classical and new test instances. We also (iii) introduce and discuss GPdoemd, the open-source implementation of the Gaussian process surrogate method.
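To make the idea of a divergence-based design criterion concrete, the following is a minimal sketch and not the GPdoemd API. It assumes that each rival model yields a one-dimensional Gaussian predictive distribution at every candidate design, uses the order-2 (quadratic) Rényi entropy, for which a mixture of Gaussians has a closed-form expression, and weights the models equally. The function names `gauss_overlap` and `jensen_renyi`, the two toy models and the fixed predictive variance are all hypothetical and chosen only for illustration.

```python
import numpy as np


def gauss_overlap(mu_i, var_i, mu_j, var_j):
    """Integral over x of N(x; mu_i, var_i) * N(x; mu_j, var_j) in one dimension."""
    var = var_i + var_j
    return np.exp(-0.5 * (mu_i - mu_j) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def jensen_renyi(mu, var, weights=None):
    """Quadratic Jensen-Renyi divergence between M one-dimensional Gaussians.

    mu, var: arrays of shape (M,) or (M, N), where N is the number of
    candidate designs. Returns a scalar or an array of shape (N,).
    Assumption: equal model weights unless `weights` is given.
    """
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    n_models = mu.shape[0]
    if weights is None:
        w = np.full(n_models, 1.0 / n_models)
    else:
        w = np.asarray(weights, dtype=float)

    # Quadratic Renyi entropy of the mixture: -log of the integral of the
    # squared mixture density, which is a weighted sum of pairwise overlaps.
    mix = 0.0
    for i in range(n_models):
        for j in range(n_models):
            mix = mix + w[i] * w[j] * gauss_overlap(mu[i], var[i], mu[j], var[j])
    h_mixture = -np.log(mix)

    # Weighted sum of the quadratic Renyi entropies of the individual Gaussians.
    h_components = sum(
        w[i] * 0.5 * np.log(4.0 * np.pi * var[i]) for i in range(n_models)
    )
    return h_mixture - h_components


# Toy illustration (hypothetical models): f1(x) = x and f2(x) = x**2,
# both with the same assumed predictive variance at every design.
x_grid = np.linspace(0.0, 2.0, 21)
means = np.stack([x_grid, x_grid ** 2])
variances = np.full_like(means, 0.1)

scores = jensen_renyi(means, variances)
best_x = x_grid[np.argmax(scores)]  # design where the predictions differ most
```

In this toy setting the criterion is zero wherever the two models make identical predictions and grows with the gap between their predictive means, so the selected design is the grid point where the models disagree most. With unequal predictive variances the score also depends on the variances, which is how prediction uncertainty enters the criterion.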
