Bidirectional Learning for Offline Model-based Biological Sequence Design

Offline model-based optimization aims to maximize a black-box objective function using only a static dataset of designs and their scores. In this paper, we focus on biological sequence design, where the goal is to maximize a sequence score. A recent approach employs bidirectional learning, combining a forward mapping for exploitation with a backward mapping that acts as a constraint, and relies on the neural tangent kernel (NTK) of an infinitely wide network to build the proxy model. Though effective, the NTK cannot learn features because of its parametrization, and its use precludes the incorporation of powerful pre-trained language models (LMs) that capture the rich biophysical information in millions of biological sequences. We adopt an alternative proxy model, adding a linear head to a pre-trained LM, and propose a linearization scheme. This yields a closed-form loss while still exploiting the biophysical information in the pre-trained LM. Moreover, the forward mapping and the backward mapping play different roles and thus deserve different weights during sequence optimization. To achieve this, we train an auxiliary model and leverage its weak supervision signal via a bi-level optimization framework to effectively learn how to balance the two mappings. Furthermore, by extending this framework, we develop \textit{Adaptive}-$\eta$, the first learning-rate adaptation module compatible with all gradient-based algorithms for offline model-based optimization. Experimental results on DNA and protein sequence design tasks verify the effectiveness of our algorithm. Our code is available~\href{https://anonymous.4open.science/r/BIB-ICLR2023-Submission/README.md}{here}.
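To make the closed-form-loss idea concrete, below is a minimal PyTorch sketch, not the paper's implementation. It assumes a frozen pre-trained LM whose embeddings are stood in for by random features (`phi`); the backward term here is an illustrative stay-near-the-data penalty rather than the paper's backward mapping, the weight `gamma` is fixed (the paper learns it via bi-level optimization), and the step size `eta` is fixed (the role \textit{Adaptive}-$\eta$ would play). The key point it shows: with a linear head on frozen features, the ridge-regularized squared loss has a closed-form minimizer, so no inner training loop is needed.

\begin{verbatim}
import torch

torch.manual_seed(0)

def fit_linear_head(phi, y, lam=1e-3):
    """Closed-form ridge solution w = (Phi^T Phi + lam I)^{-1} Phi^T y."""
    d = phi.shape[1]
    A = phi.T @ phi + lam * torch.eye(d)
    return torch.linalg.solve(A, phi.T @ y)

# Toy stand-in for frozen pre-trained LM embeddings of n offline designs.
n, d = 128, 32
phi = torch.randn(n, d)           # embed(x_i) from a frozen LM (assumed)
y = torch.randn(n, 1)             # offline scores from the static dataset
w = fit_linear_head(phi, y)       # proxy f(x) = embed(x) @ w, fit in closed form

# Sequence optimization: gradient ascent on a relaxed design z, balancing a
# forward (exploitation) term against a backward (constraint-style) term.
z = torch.randn(1, d, requires_grad=True)   # relaxed design in feature space
gamma, eta = 0.5, 1e-2                      # fixed here; learned in the paper
for _ in range(100):
    forward = (z @ w).sum()                        # proxy score to maximize
    backward = -((z - phi.mean(0)) ** 2).sum()     # stay close to the data (illustrative)
    loss = -(gamma * forward + (1 - gamma) * backward)
    loss.backward()
    with torch.no_grad():
        z -= eta * z.grad
        z.grad.zero_()
\end{verbatim}

In the full method, `gamma` is adjusted through the auxiliary model's weak supervision signal inside the bi-level loop, and \textit{Adaptive}-$\eta$ adapts `eta` during optimization; both are held constant above purely for illustration.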
