Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery
Arzucan Özgür | Hakime Öztürk | Teodoro Laino | Elif Ozkirimli | Philippe Schwaller