Large language models generate functional protein sequences across diverse families
R. Socher | Ben Krause | Caiming Xiong | J. Holton | Ali Madani | J. Fraser | Subu Subramanian | J. L. Olmos | N. Naik | E. Greene | Benjamin P. Mohr | Zachary Z Sun
[1] B. Höcker,et al. ProtGPT2 is a deep unsupervised language model for protein design , 2022, Nature Communications.
[2] S. Liao,et al. A backbone-centred energy function of neural networks for protein design , 2022, Nature.
[3] S. Ovchinnikov,et al. ColabFold: making protein folding accessible to all , 2022, Nature Methods.
[4] Oriol Vinyals,et al. Highly accurate protein structure prediction with AlphaFold , 2021, Nature.
[5] David E. Kim,et al. Protein sequence design by conformational landscape optimization , 2021, Proceedings of the National Academy of Sciences.
[6] Yi Yan Yang,et al. Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations , 2021, Nature Biomedical Engineering.
[7] Lucy J. Colwell,et al. Deep diversification of an AAV capsid protein by machine learning , 2021, Nature Biotechnology.
[8] Ethan C. Alley,et al. Low-N protein engineering with data-efficient deep learning , 2020, Nature Methods.
[9] Aleksej Zelezniak,et al. Expanding functional protein sequence space using generative adversarial networks , 2019, bioRxiv.
[10] Adam J. Riesselman,et al. Protein design and variant prediction using autoregressive generative models , 2019, Nature Communications.
[11] Myle Ott,et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences , 2019, Proceedings of the National Academy of Sciences.
[12] M. Mirdita,et al. Fast and sensitive taxonomic assignment to metagenomic contigs , 2020, bioRxiv.
[13] Simona Cocco,et al. An evolution-based model for designing chorismate mutase enzymes , 2020, Science.
[14] David Baker,et al. De novo protein design by deep network hallucination , 2020, Nature.
[15] F. Arnold,et al. Signal Peptides Generated by Attention-Based Neural Networks. , 2020, ACS synthetic biology.
[16] Po-Ssu Huang,et al. Protein sequence design with a learned potential , 2020, bioRxiv.
[17] George M. Church,et al. Unified rational protein engineering with sequence-only deep representation learning , 2019, bioRxiv.
[18] R. Dobson,et al. On the catalytic mechanism of bacteriophage endolysins: Opportunities for engineering. , 2019, Biochimica et biophysica acta. Proteins and proteomics.
[19] A. Buckle,et al. Catalytic diversity and cell wall binding repeats in the phage‐encoded endolysins , 2018, Molecular microbiology.
[20] Robert P. Sheridan,et al. The EVcouplings Python framework for coevolutionary sequence analysis , 2018, bioRxiv.
[21] D. Baker,et al. The coming of age of de novo protein design , 2016, Nature.
[22] Robert A. Langan,et al. De novo design of protein homo-oligomers with modular hydrogen-bond network–mediated specificity , 2016, Science.
[23] D. Baker,et al. De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy , 2015, Nature chemical biology.
[24] D. Baker,et al. Control over overall shape and size in de novo designed proteins , 2015, Proceedings of the National Academy of Sciences.
[25] Debora S. Marks,et al. Inferring Pairwise Interactions from Biological Data Using Maximum-Entropy Probability Models , 2015, PLoS Comput. Biol.
[26] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[27] Deok-Soo Kim,et al. BetaCavityWeb: a webserver for molecular voids and channels , 2015, Nucleic Acids Res.
[28] D. Baker,et al. Principles for designing ideal protein structures , 2012, Nature.
[29] Scott Federhen,et al. The NCBI Taxonomy database , 2011, Nucleic Acids Res.
[30] C. Sander,et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families , 2011, Proceedings of the National Academy of Sciences.
[31] Sivaraman Balakrishnan,et al. Learning generative models for protein fold families , 2011, Proteins.
[32] Dale E Tronrud,et al. Lessons from the lysozyme of phage T4 , 2010, Protein science : a publication of the Protein Society.
[33] B. Matthews,et al. Evaluation at atomic resolution of the role of strain in destabilizing the temperature‐sensitive T4 lysozyme mutant Arg 96 → His , 2009, Protein science : a publication of the Protein Society.
[34] T. Hwa,et al. Identification of direct residue contacts in protein–protein interaction by message passing , 2009, Proceedings of the National Academy of Sciences.
[35] E. Birney,et al. Pfam: the protein families database , 2013, Nucleic Acids Res.
[36] F. Studier,et al. Protein production by auto-induction in high density shaking cultures. , 2005, Protein expression and purification.
[37] Cathy H. Wu,et al. The Universal Protein Resource (UniProt) , 2004, Nucleic Acids Res.
[38] Rolf Apweiler,et al. UniProt archive , 2004, Bioinform.
[39] M. Ashburner,et al. Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.
[40] B. Rost. Twilight zone of protein sequence alignments. , 1999, Protein engineering.
[41] K. J. Oh,et al. Conformation of T4 lysozyme in solution. Hinge-bending motion and the substrate-induced conformational transition studied by site-directed spin labeling. , 1997, Biochemistry.
[42] B. Matthews,et al. A covalent enzyme-substrate intermediate with saccharide distortion in a mutant T4 lysozyme. , 1993, Science.
[43] Paul Martin,et al. Potts Models and Related Problems in Statistical Mechanics , 1991.
[44] Shana Poplack,et al. Sometimes I’ll start a sentence in Spanish Y TERMINO EN ESPAÑOL: toward a typology of code-switching , 1980.
[45] Carol Pfaff. Constraints on Language Mixing: Intrasentential Code-Switching and Borrowing in Spanish/English , 1979.
[46] B. Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. , 1975, Biochimica et biophysica acta.