Large scale active-learning-guided exploration for in vitro protein production optimization

Lysate-based cell-free systems have become a major platform to study gene expression, but batch-to-batch variation makes protein production difficult to predict. Here we describe an active learning approach to explore a combinatorial space of ~4,000,000 cell-free buffer compositions, maximizing protein production and identifying critical parameters involved in cell-free productivity. We also provide a one-step method to achieve high-quality predictions for protein production using minimal experimental effort, regardless of the lysate quality.

Cell-free lysates are a major platform for in vitro protein production, but batch-to-batch variation makes production difficult to predict. Here the authors develop an active learning approach to optimising buffer conditions, bringing homemade lysates up to commercial production potential.
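To make the idea of active-learning-guided exploration concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, 11 buffer components at 4 concentration levels each (which gives 4^11 = 4,194,304 compositions, consistent with the ~4,000,000 quoted above), a synthetic `simulated_yield` function standing in for a real measurement, and a simple bootstrap ensemble of additive models scored by mean plus disagreement. All function names, parameters, and the model choice are hypothetical.

```python
import random
import statistics

N_COMPONENTS = 11      # assumption: number of buffer components
LEVELS = (0, 1, 2, 3)  # assumption: four concentration levels per component


def space_size():
    """Number of distinct buffer compositions in the assumed design space."""
    return len(LEVELS) ** N_COMPONENTS


def simulated_yield(comp, rng):
    """Synthetic stand-in for a measured protein yield (not real data)."""
    signal = sum(level * (i % 3) for i, level in enumerate(comp))
    return signal + rng.gauss(0.0, 0.5)


def fit_additive(data, rng):
    """Fit one ensemble member on a bootstrap resample of the data.

    The model stores the overall mean yield and, for each
    (component index, level) pair, its mean deviation from that mean.
    """
    sample = [rng.choice(data) for _ in data]
    overall = statistics.fmean(y for _, y in sample)
    sums, counts = {}, {}
    for comp, y in sample:
        for key in enumerate(comp):
            sums[key] = sums.get(key, 0.0) + y
            counts[key] = counts.get(key, 0) + 1
    effects = {k: sums[k] / counts[k] - overall for k in sums}
    return overall, effects


def predict(model, comp):
    """Predict yield as the overall mean plus per-level effects."""
    overall, effects = model
    return overall + sum(effects.get(key, 0.0) for key in enumerate(comp))


def active_learning(rounds=5, batch=20, n_models=10, pool=500, kappa=1.0, seed=0):
    """Run a few rounds of 'fit ensemble, pick batch, measure'."""
    rng = random.Random(seed)

    def draw():
        return tuple(rng.choice(LEVELS) for _ in range(N_COMPONENTS))

    # Round 0: random compositions, "measured" with the synthetic function.
    data = [(c, simulated_yield(c, rng)) for c in (draw() for _ in range(batch))]
    best_per_round = [max(y for _, y in data)]

    for _ in range(rounds):
        models = [fit_additive(data, rng) for _ in range(n_models)]
        tested = {c for c, _ in data}
        # Score a random candidate pool instead of all ~4 million compositions.
        candidates = list({draw() for _ in range(pool)} - tested)

        def score(c):
            preds = [predict(m, c) for m in models]
            # Predicted yield (exploitation) plus ensemble spread (exploration).
            return statistics.fmean(preds) + kappa * statistics.pstdev(preds)

        chosen = sorted(candidates, key=score, reverse=True)[:batch]
        data += [(c, simulated_yield(c, rng)) for c in chosen]
        best_per_round.append(max(y for _, y in data))

    return data, best_per_round


if __name__ == "__main__":
    print("design space size:", space_size())
    _, best = active_learning()
    print("best simulated yield after each round:", [round(b, 2) for b in best])
```

In this sketch the `kappa` parameter trades off exploiting compositions predicted to be productive against exploring compositions on which the ensemble members disagree; a real workflow would replace `simulated_yield` with plate measurements and would likely use a more expressive model.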