BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig | Ali Madani | Lav R. Varshney | Caiming Xiong | Richard Socher | Nazneen Fatema Rajani