Data Base similarity (DBsimilarity) of natural products to aid compound identification on MS and NMR pipelines, similarity networking, and more.

INTRODUCTION We developed Data Base similarity (DBsimilarity), a user-friendly tool that organizes structure databases into similarity networks. Its goal is to make this information easier to visualize, primarily for natural product chemists who may not have coding experience.

METHOD DBsimilarity, written as Jupyter Notebooks, converts Structure Data File (SDF) files into Comma-Separated Values (CSV) files, adds chemoinformatics data, builds an MZmine custom database file and an NMRfilter candidate list of compounds for rapid dereplication of MS and 2D NMR data, calculates similarities between compounds, and writes CSV files formatted as similarity networks for Cytoscape.

RESULTS The LOTUS database was used as a source of Ginkgo biloba compounds, and DBsimilarity was used to create similarity networks that include NPClassifier classifications to indicate biosynthetic pathways. Subsequently, a database of validated antibiotics from natural products was combined with the G. biloba compounds to identify promising compounds. The presence of 11 compounds in both datasets points to possible antibiotic properties of G. biloba, and 122 compounds similar to these known antibiotics were highlighted. Next, DBsimilarity was used to filter the NPAtlas database (selecting only entries with a MIBiG reference) to identify potential antibacterial compounds, using the ChEMBL database as a reference. This promptly identified five compounds found in both databases and 167 others worthy of further investigation.

CONCLUSION Chemical and biological properties are determined by molecular structures. DBsimilarity enables the creation of interactive similarity networks using Cytoscape. It is also in line with a recent review that highlights poor biological plausibility and unrealistic chromatographic behaviors as significant sources of errors in compound identification.
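The similarity and network-export steps described in the METHOD can be illustrated with a minimal sketch. This is not DBsimilarity's actual code: it assumes each compound's fingerprint is already available as a set of on-bit indices, uses the Tanimoto coefficient as the similarity measure, and applies a hypothetical threshold. The function names, compound identifiers, and CSV column names are all illustrative.

```python
import csv
import io
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two sets of fingerprint on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_edges(fingerprints: dict, threshold: float = 0.7) -> list:
    """Return (source, target, similarity) for each pair at or above threshold.

    The default threshold is an assumption made for this sketch only.
    """
    edges = []
    for id_a, id_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[id_a], fingerprints[id_b])
        if score >= threshold:
            edges.append((id_a, id_b, round(score, 3)))
    return edges


def edges_to_csv(edges: list) -> str:
    """Serialize edges as a CSV edge table (column names are illustrative)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "similarity"])
    writer.writerows(edges)
    return buffer.getvalue()


# Toy fingerprints (hypothetical compounds, not real data).
toy = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 3, 5}),
    "cmpd_C": frozenset({7, 8, 9}),
}
edges = similarity_edges(toy, threshold=0.5)
edge_csv = edges_to_csv(edges)
print(edge_csv)
```

In this toy case only cmpd_A and cmpd_B share enough bits to be linked. An edge table of this shape can be loaded into Cytoscape by mapping the first two columns to source and target nodes and the third to an edge attribute.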
