Artificial Intelligence in Decrypting Cytoprotective Activity under Oxidative Stress from Molecular Structure

Artificial intelligence (AI) is now widely explored and offers opportunities to enhance classical approaches in quantitative structure-activity relationship (QSAR) studies. The aim of this study was to investigate cytoprotective activity under oxidative stress conditions for indole-based structures, with the ultimate goal of developing AI models capable of predicting cytoprotective activity and generating novel indole-based compounds. We propose a new AI system that suggests new chemical structures based on known cytoprotective activity. Cytoprotective activity prediction models were built using random forest, decision tree, support vector machine, K-nearest neighbors, and multiple linear regression algorithms, and the best-performing model, as judged by quality measurements, was used to make predictions. Finally, the computational results were evaluated experimentally in vitro. The proposed methodology yielded a library of new indole-based compounds with assigned cytoprotective activity. A further outcome of this study was a validated predictive model that estimates cytoprotective activity, to a certain extent, using molecular structure as input, supported by experimental confirmation.
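The model-selection step described above (fit several regressors, score each with a quality measurement, keep the best) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional "descriptors" and activity values are synthetic, only a K-nearest neighbors regressor and a mean baseline are compared, and leave-one-out mean squared error stands in for whatever quality measurements the study actually used.

```python
import math
import random
from statistics import mean


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return mean(y for _, y in ranked[:k])


def mean_predict(train_x, train_y, query):
    """Baseline: ignore the descriptors and predict the training mean."""
    return mean(train_y)


def loo_mse(predict, xs, ys, **kwargs):
    """Leave-one-out mean squared error of a predictor."""
    errors = []
    for i in range(len(xs)):
        # Hold out sample i, train on the rest, and score the prediction.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = predict(train_x, train_y, xs[i], **kwargs)
        errors.append((pred - ys[i]) ** 2)
    return mean(errors)


def select_best(candidates, xs, ys):
    """Score every candidate and return the name with the lowest error."""
    scores = {
        name: loo_mse(fn, xs, ys, **kw) for name, (fn, kw) in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Synthetic stand-in data (hypothetical): two descriptors per compound and
# an activity value that depends on them linearly plus a little noise.
rng = random.Random(0)
xs = [(rng.random(), rng.random()) for _ in range(40)]
ys = [2.0 * a - b + rng.gauss(0, 0.05) for a, b in xs]

candidates = {
    "mean_baseline": (mean_predict, {}),
    "knn_k1": (knn_predict, {"k": 1}),
    "knn_k3": (knn_predict, {"k": 3}),
}
best_name, scores = select_best(candidates, xs, ys)
print(best_name, scores)
```

In a real QSAR workflow the candidates would be the algorithms named in the abstract, fitted on computed molecular descriptors, and the selected model would then be used for prediction.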
