Embeddings from deep learning transfer GO annotations beyond homology
Burkhard Rost | Michael Heinzinger | Christian Dallago | Maria Littmann | Tobias Olenyi
[1] Rachael P. Huntley, et al. The Gene Ontology Annotation (GOA) Database, 2009.
[2] Michel Schneider, et al. UniProtKB/Swiss-Prot, 2007, Methods in molecular biology.
[3] B. Rost, et al. Protein structures sustain evolutionary drift, 1997, Folding & design.
[4] Johannes Söding, et al. MMseqs2: sensitive protein sequence searching for the analysis of massive data sets, 2017, bioRxiv.
[5] Burkhard Rost, et al. ProtTrans: Towards Cracking the Language of Life’s Code Through Self-Supervised Deep Learning and High Performance Computing, 2020, bioRxiv.
[6] Bosco K. Ho, et al. Systematic modeling of SARS-CoV-2 protein structures, 2020.
[7] Tapio Salakoski, et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens, 2019, Genome Biology.
[8] Johannes Söding, et al. Clustering huge protein sequence sets in linear time, 2018.
[9] Burkhard Rost, et al. LocTree3 prediction of localization, 2014, Nucleic Acids Res.
[10] M. Kanehisa, et al. Prediction of protein function from sequence properties. Discriminant analysis of a data base, 1984, Biochimica et biophysica acta.
[11] Zhengwei Zhu, et al. CD-HIT: accelerated for clustering the next-generation sequencing data, 2012, Bioinform.
[12] The Gene Ontology Consortium, et al. The Gene Ontology Resource: 20 years and still GOing strong, 2018, Nucleic Acids Res.
[13] Yana Bromberg, et al. Computational prediction shines light on type III secretion origins, 2016, Scientific Reports.
[14] Björn W. Schuller, et al. Contextual Bidirectional Long Short-Term Memory Recurrent Neural Network Language Models: A Generative Approach to Sentiment Analysis, 2017, EACL.
[15] K. Nakai, et al. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization, 1999, Trends in biochemical sciences.
[16] Marco Punta, et al. Beyond annotation transfer by homology: novel protein-function prediction methods to assist drug discovery, 2005, Drug discovery today.
[17] C. Sander, et al. Yeast chromosome III: new gene functions, 1994, The EMBO journal.
[18] Nadia El-Mabrouk, et al. ISMB 2020 proceedings, 2020, Bioinform.
[19] Chandra Bhagavatula, et al. Semi-supervised sequence tagging with bidirectional language models, 2017, ACL.
[20] Robert E. Schapire, et al. Hierarchical multi-label prediction of gene function, 2006, Bioinform.
[21] James B. Anderson, et al. Clonal evolution and genome stability in a 2,500-year-old fungal individual, 2018, bioRxiv.
[22] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[23] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[24] Holger Schwenk, et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, 2017, EMNLP.
[25] James C. Hu, et al. The Gene Ontology Resource: 20 years and still GOing strong, 2019.
[26] Burkhard Rost, et al. Modeling aspects of the language of life through transfer-learning protein sequences, 2019, BMC Bioinformatics.
[27] Burkhard Rost, et al. Protein–Protein Interactions More Conserved within Species than across Species, 2006, PLoS Comput. Biol.
[28] Maxat Kulmanov, et al. DeepGOPlus: Improved protein function prediction from sequence, 2019.
[29] Alexander M. Rush, et al. Character-Aware Neural Language Models, 2015, AAAI.
[30] B. Rost, et al. Automatic prediction of protein function, 2003, Cellular and Molecular Life Sciences CMLS.
[31] Johannes Söding, et al. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold, 2018, Nature Methods.
[32] Christian Schaefer, et al. Homology-based inference sets the bar high for protein function prediction, 2013, BMC Bioinformatics.
[33] Jari Björne, et al. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens, 2019, Genome Biology.
[34] K Nishikawa, et al. Correlation of the amino acid composition of a protein to its structural and biological characters, 1982, Journal of biochemistry.
[35] Daniel W. A. Buchan, et al. A large-scale evaluation of computational protein function prediction, 2013, Nature Methods.
[36] Adam Godzik, et al. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences, 2006, Bioinform.
[37] E. Zuckerkandl. Evolutionary processes and evolutionary noise at the molecular level, 1976, Journal of Molecular Evolution.
[38] M. Ashburner, et al. Gene Ontology: tool for the unification of biology, 2000, Nature Genetics.
[39] B. Rost, et al. ProNA2020 predicts protein-DNA, protein-RNA and protein-protein binding proteins and residues from sequence, 2020, Journal of molecular biology.
[40] Hannah Currant, et al. FFPred 3: feature-based function prediction for all Gene Ontology domains, 2016, Scientific Reports.
[41] Burkhard Rost, et al. Inferring sub-cellular localization through automated lexical analysis, 2002, ISMB.
[42] H. Krebs, et al. Metabolism of ketonic acids in animal tissues, 1937, The Biochemical journal.
[43] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[44] Torsten Schwede, et al. SWISS-MODEL: homology modelling of protein structures and complexes, 2018, Nucleic Acids Res.
[45] Lav R. Varshney, et al. BERTology Meets Biology: Interpreting Attention in Protein Language Models, 2020, bioRxiv.
[46] Myle Ott, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences, 2019, Proceedings of the National Academy of Sciences.
[47] Prudence Mutowo-Meullenet, et al. The GOA database: Gene Ontology annotation updates for 2015, 2014, Nucleic Acids Res.
[48] B. Rost, et al. Adaptation of protein surfaces to subcellular location, 1998, Journal of molecular biology.
[49] P. Radivojac, et al. Analysis of protein function and its prediction from amino acid sequence, 2011, Proteins.
[50] Predrag Radivojac, et al. Community-Wide Evaluation of Computational Function Prediction, 2016, Methods in molecular biology.
[51] Peter E. Hart, et al. Nearest neighbor pattern classification, 1967, IEEE Trans. Inf. Theory.
[52] Matt J. Kusner, et al. From Word Embeddings To Document Distances, 2015, ICML.
[53] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[54] Daisuke Kihara, et al. NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology, 2017, BMC Bioinformatics.
[55] B. Rost. Twilight zone of protein sequence alignments, 1999, Protein engineering.
[56] C. Spearman. The proof and measurement of association between two things, 2015, International journal of epidemiology.
[57] H. Margalit, et al. Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites, 1998, Nucleic acids research.
[58] Jason Weston, et al. Mismatch string kernels for discriminative protein classification, 2004, Bioinform.
[59] Tapio Salakoski, et al. An expanded evaluation of protein function prediction methods shows an improvement in accuracy, 2016, Genome Biology.
[60] M J Sternberg, et al. Prediction of structural and functional features of protein and nucleic acid sequences by artificial neural networks, 1992, Biochemistry.
[61] David Warde-Farley, et al. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function, 2008, Genome Biology.
[62] Jason Weston, et al. Mismatch String Kernels for SVM Protein Classification, 2002, NIPS.
[63] Emily Dimmer, et al. The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology, 2004, Nucleic Acids Res.
[64] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[65] Burkhard Rost, et al. Sequence conserved for subcellular localization, 2002, Protein science : a publication of the Protein Society.
[66] Timothy M. Hospedales, et al. Analogies Explained: Towards Understanding Word Embeddings, 2019, ICML.
[67] Guoyin Wang, et al. Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms, 2018, ACL.
[68] The UniProt Consortium, et al. UniProt: a worldwide hub of protein knowledge, 2018, Nucleic Acids Res.
[69] Chee Keong Kwoh, et al. Structural analysis of the novel influenza A (H7N9) viral Neuraminidase interactions with current approved neuraminidase inhibitors Oseltamivir, Zanamivir, and Peramivir in the presence of mutation R289K, 2013, BMC Bioinformatics.
[70] Maxat Kulmanov, et al. DeepGOPlus: improved protein function prediction from sequence, 2019, bioRxiv.
[71] Emile Zuckerkandl, et al. Evolutionary processes and evolutionary noise at the molecular level, 1976, Journal of Molecular Evolution.
[72] M. O. Dayhoff, et al. Atlas of protein sequence and structure, 1965.
[73] B. Rost. Enzyme function less conserved than anticipated, 2002, Journal of molecular biology.
[74] J. Söding, et al. Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold, 2018, bioRxiv.
[75] T. Gaasterland, et al. Microbial genescapes: phyletic and functional patterns of ORF distribution among prokaryotes, 1998, Microbial & comparative genomics.
[76] Burkhard Rost, et al. SARS-CoV-2 structural coverage map reveals state changes that disrupt host immunity, 2020.
[77] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.