Fast and shift-insensitive similarity comparisons of NMR using a tree-representation of spectra

An efficient method for extracting and storing information from NMR spectra is proposed that is suitable for spectral comparison and for building a search engine. This tree-based method does not require peak picking or any other pre-treatment of the data, and it is found to outperform the currently available methods in both compactness and speed. Our approach was tested on 1D proton spectra and 2D HSQC spectra and compared with the method proposed by Pretsch and coworkers [1,2] [Bodis et al. 2007, Bodis et al. 2009]. Additionally, the correspondence between spectral and structural similarity was evaluated for both methods. © 2013 Elsevier B.V. All rights reserved.
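The abstract does not spell out how the tree is built, so the following is only an illustrative sketch of one way a 1D spectrum could be stored as a tree without peak picking. Every detail here is an assumption rather than the authors' published algorithm: the recursive split at the intensity-weighted center, the names `Node`, `spectrum_tree` and `similarity`, and the parameters `depth` and `width` are all hypothetical. The sketch also assumes an ascending chemical-shift axis and non-negative intensities.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    center: float                  # intensity-weighted mean position of the segment
    area: float                    # fraction of the total intensity in the segment
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def _build(x, y, lo, hi, depth):
    """Summarize y[lo:hi] and recurse on the two sides of its center (assumed rule)."""
    area = sum(y[lo:hi])
    if hi <= lo or area <= 0.0:
        return None
    center = sum(x[i] * y[i] for i in range(lo, hi)) / area
    node = Node(center, area)
    if depth > 0 and hi - lo > 1:
        # First index whose position lies to the right of the center.
        split = lo
        while split < hi and x[split] <= center:
            split += 1
        split = min(max(split, lo + 1), hi - 1)  # keep both sides non-empty
        node.left = _build(x, y, lo, split, depth - 1)
        node.right = _build(x, y, split, hi, depth - 1)
    return node


def spectrum_tree(x: Sequence[float], y: Sequence[float], depth: int = 4):
    """Build a tree from raw points; x ascending, y non-negative (assumptions)."""
    total = sum(y)
    if total <= 0.0:
        return None
    norm = [v / total for v in y]  # make areas comparable between spectra
    return _build(x, norm, 0, len(x), depth)


def similarity(a: Optional[Node], b: Optional[Node], width: float = 0.5) -> float:
    """Hypothetical score in [0, 1]; centers closer than `width` still match partly."""
    if a is None and b is None:
        return 1.0
    if a is None or b is None:
        return 0.0
    shift = max(0.0, 1.0 - abs(a.center - b.center) / width)
    local = shift * min(a.area, b.area) / max(a.area, b.area)
    if a.left is None and a.right is None and b.left is None and b.right is None:
        return local
    # Half the weight on this node, the rest shared by the two subtrees.
    return 0.5 * local + 0.25 * (
        similarity(a.left, b.left, width) + similarity(a.right, b.right, width)
    )
```

In this sketch the tolerance `width` is what makes the comparison tolerant of small shifts: two nodes whose centers differ by less than `width` still contribute a partial score instead of being treated as a mismatch. A tree of depth `depth` stores at most 2^(depth+1) - 1 nodes regardless of the number of points in the spectrum, which illustrates why such a representation can be compact.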

[1] T. Arakawa, et al. Dehydration-induced conformational transitions in proteins and their inhibition by stabilizers, 1993, Biophysical journal.

[2] E. Pretsch, et al. Automatic compatibility tests of HSQC NMR spectra with proposed structures of chemical compounds, 2009, Talanta.

[3] S. J. Prestrelski, et al. Separation of freezing- and drying-induced denaturation of lyophilized proteins using stress-specific stabilization. II. Structural studies using infrared spectroscopy, 1993, Archives of biochemistry and biophysics.

[4] K. Varmuza, et al. Spectral similarity versus structural similarity: infrared spectroscopy, 2003.

[5] David S. Wishart, et al. MetaboMiner – semi-automated identification of metabolites from 2D NMR spectra of complex biofluids, 2008, BMC Bioinformatics.

[6] E. Pretsch, et al. General theory of similarity measures for library search systems, 1988.

[7] K. Varmuza, et al. Spectral similarity versus structural similarity: mass spectrometry, 2004.

[8] Thomas Sander, et al. OSIRIS, an Entirely in-House Developed Drug Discovery Informatics System, 2009, J. Chem. Inf. Model.

[9] David S. Wishart, et al. HMDB 3.0—The Human Metabolome Database in 2013, 2012, Nucleic Acids Res.

[10] S. J. Prestrelski, et al. Separation of freezing- and drying-induced denaturation of lyophilized proteins using stress-specific stabilization. I. Enzyme activity and calorimetric studies, 1993, Archives of biochemistry and biophysics.

[11] Rafael Brüschweiler, et al. Web server based complex mixture analysis by NMR, 2008, Analytical chemistry.

[12] João Aires-de-Sousa, et al. Prediction of 1H NMR Coupling Constants with Associative Neural Networks Trained for Chemical Shifts, 2007, J. Chem. Inf. Model.

[13] Andrés M. Castillo, et al. Fast and accurate algorithm for the simulation of NMR spectra of large spin systems, 2011, Journal of magnetic resonance.

[14] Christoph Steinbeck, et al. NMRShiftDB -- compound identification and structure elucidation support through a free community-built web database, 2004, Phytochemistry.

[15] Alexander Hinneburg, et al. Duplicate detection of 2D-NMR Spectra, 2007, J. Integr. Bioinform.

[16] M. Manning, et al. Quantitation of the area of overlap between second-derivative amide I infrared spectra to determine the structural similarity of a protein in different states, 1996, Journal of pharmaceutical sciences.

[17] Simon K. Kearsley, et al. Using similarity searches over databases of estimated 13C NMR spectra for structure identification of natural product compounds, 1995.

[18] João Aires-de-Sousa, et al. The Impact of Available Experimental Data on the Prediction of 1H NMR Chemical Shifts by Neural Networks, 2004, J. Chem. Inf. Model.

[19] J. Meiler. PROSHIFT: Protein chemical shift prediction using artificial neural networks, 2003, Journal of biomolecular NMR.

[20] Ramise Raja Mondal. Laboratory Information Management System, 2014.

[21] John L. Markley, et al. Metabolite identification via the Madison Metabolomics Consortium Database, 2008, Nature Biotechnology.

[22] E. Pretsch, et al. A novel spectra similarity measure, 2007.

[23] Elizabeth Turner, et al. Laboratory Information Management Systems, 2001.

[24] Tom Fawcett, et al. An introduction to ROC analysis, 2006, Pattern Recognit. Lett.

[25] Peter Lundberg, et al. MDL– the magnetic resonance metabolomics database, 2005.