About Model Validation in Bioprocessing

In bioprocess engineering, the Quality by Design (QbD) initiative encourages the use of models to define design spaces. However, clear guidelines on how models for QbD should be validated are still missing. In this review, we provide a comprehensive overview of the validation methods, mathematical approaches, and metrics currently applied in bioprocess modeling. The methods cover analytics of the data used for modeling, model training and selection, measures of predictiveness, and model uncertainties. We point out general issues in model validation and calibration for different types of models and place them in the context of existing health authority recommendations. This review provides a starting point for developing a guide to model validation approaches. There is no one-size-fits-all approach, but this review should help to identify the best-fitting validation method, or combination of methods, for the specific task and the type of bioprocess model being developed.
