Deep neural language modeling enables functional protein generation across families
Zachary Z. Sun | R. Socher | Ben Krause | Caiming Xiong | J. Holton | Ali Madani | J. Fraser | Subu Subramanian | J. L. Olmos | N. Naik | E. Greene | Benjamin P. Mohr