Opportunities of Hybrid Model-based Reinforcement Learning for Cell Therapy Manufacturing Process Control

Driven by the key challenges of cell therapy manufacturing, including high complexity, high uncertainty, and very limited process observations, we propose a hybrid model-based reinforcement learning (RL) framework to efficiently guide process control. We first create a probabilistic knowledge graph (KG) hybrid model that characterizes the risk- and science-based understanding of biomanufacturing process mechanisms and quantifies inherent stochasticity, e.g., batch-to-batch variation. The model captures key process features, including nonlinear reactions, nonstationary dynamics, and partially observed states. It can leverage existing mechanistic models and facilitate learning from heterogeneous process data. A computational sampling approach is used to generate posterior samples that quantify model uncertainty. We then introduce hybrid model-based Bayesian RL, which accounts for both inherent stochasticity and model uncertainty, to guide optimal, robust, and interpretable dynamic decision making. Cell therapy manufacturing examples empirically demonstrate that the proposed framework can outperform process optimization assisted by a classical deterministic mechanistic model.
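The two-stage idea in the abstract (sample a posterior over model parameters, then choose decisions that average over both process noise and parameter uncertainty) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy Monod-type growth model `simulate_batch`, the single unknown growth rate `mu`, the uniform prior, the tolerance `tol`, the feed cost, and the reward in log-density units. Approximate Bayesian computation by rejection stands in for the unspecified "computational sampling approach", and a one-shot choice among constant feed rates stands in for full sequential Bayesian RL.

```python
import math
import random

TRUE_MU = 0.6  # hypothetical "true" growth rate used only to create toy data


def simulate_batch(mu, feed, rng, steps=5, x0=1.0, k=0.5, sigma=0.05):
    """Return final log cell density under a constant feed rate (toy model).

    sigma is the per-step noise, standing in for batch-to-batch variation.
    """
    log_x = math.log(x0)
    for _ in range(steps):
        log_x += mu * feed / (k + feed) + sigma * rng.gauss(0.0, 1.0)
    return log_x


def abc_posterior(observations, rng, n_draws=5000, tol=0.15):
    """ABC rejection: keep prior draws whose simulated batches match the data."""
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(0.1, 1.0)  # assumed uniform prior on the growth rate
        dist = max(abs(simulate_batch(mu, feed, rng) - log_x)
                   for feed, log_x in observations)
        if dist < tol:
            accepted.append(mu)
    return accepted


def expected_reward(feed, posterior, rng, cost=0.8, n_rep=20):
    """Average reward over posterior samples (model uncertainty) and
    repeated noisy simulations (inherent stochasticity)."""
    total = 0.0
    for mu in posterior:
        for _ in range(n_rep):
            total += simulate_batch(mu, feed, rng) - cost * feed
    return total / (len(posterior) * n_rep)


rng = random.Random(0)

# Three toy "historical batches", generated noise-free so the sketch is
# deterministic; these are synthetic values, not experimental data.
observations = [(f, simulate_batch(TRUE_MU, f, rng, sigma=0.0))
                for f in (0.5, 1.0, 2.0)]

posterior = abc_posterior(observations, rng)
candidate_feeds = [0.25, 0.5, 1.0, 2.0, 4.0]
best_feed = max(candidate_feeds,
                key=lambda f: expected_reward(f, posterior, rng))

print(f"accepted posterior samples: {len(posterior)}")
print(f"posterior mean of mu: {sum(posterior) / len(posterior):.3f}")
print(f"feed rate with highest posterior-averaged reward: {best_feed}")
```

A deterministic baseline would plug in a single point estimate of `mu` and set `sigma=0`. The sketch instead scores each candidate action against every accepted posterior sample, which is the sense in which the decision accounts for model uncertainty as well as process noise.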
