WaVPeak: picking NMR peaks through wavelet-based smoothing and volume-based filtering

Motivation: Nuclear magnetic resonance (NMR) has been widely used as a powerful tool to determine the 3D structures of proteins in vivo. However, the post-spectra processing stage of NMR structure determination, which includes peak picking, chemical shift assignment and structure calculation, usually demands a tremendous amount of time and expert knowledge. Detecting accurate peaks from the NMR spectra is a prerequisite for all subsequent steps, and thus remains a key problem in automatic NMR structure determination.

Results: We introduce WaVPeak, a fully automatic peak detection method. WaVPeak first smooths the given NMR spectrum with wavelets. Peaks are then identified as the local maxima, and false positive peaks are filtered out efficiently by considering the volume of the peaks. WaVPeak has two major advantages over state-of-the-art peak-picking methods. First, wavelet-based smoothing does not eliminate any data point in the spectra, so WaVPeak is able to detect weak peaks that are embedded in the noise. These weak peaks are the ones NMR spectroscopists most need help isolating. Second, WaVPeak estimates the volume of the peaks to filter the false positives, which is more reliable than the intensity-based filters widely used in existing methods. We evaluate the performance of WaVPeak on the benchmark set proposed by PICKY (Alipanahi et al., 2009), one of the most accurate methods in the literature. The dataset comprises 32 2D and 3D spectra from eight different proteins. Experimental results demonstrate that WaVPeak achieves an average recall of 96%, 91%, 88%, 76% and 85% on 15N-HSQC, HNCO, HNCA, HNCACB and CBCA(CO)NH, respectively. When the same number of peaks is considered, WaVPeak significantly outperforms PICKY.

Availability: WaVPeak is an open source program. The source code and two test spectra of WaVPeak are available at http://faculty.kaust.edu.sa/sites/xingao/Pages/Publications.aspx. The online server is under construction.

Contact: statliuzhi@xmu.edu.cn; ahmed.abbas@kaust.edu.sa; majing@ust.hk; xin.gao@kaust.edu.sa
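The three stages described in the Results (wavelet smoothing, local-maximum detection, volume-based filtering) can be illustrated with a minimal 1D sketch. This is not the WaVPeak implementation: it assumes a Haar wavelet with soft thresholding as a stand-in for the wavelet smoothing, and it approximates a peak's volume by summing smoothed intensities in a fixed window. The function names and the parameters `levels`, `threshold`, `half_width` and `min_volume` are hypothetical and chosen only for illustration.

```python
import math


def haar_smooth(signal, levels=1, threshold=0.5):
    """Smooth a 1D signal by soft-thresholding Haar detail coefficients.

    Assumption: Haar is used here only for simplicity. The signal length
    must be divisible by 2**levels.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    s = math.sqrt(2.0)
    approx = [float(x) for x in signal]
    details = []
    # Forward transform: split into pairwise averages and differences.
    for _ in range(levels):
        half = len(approx) // 2
        avg = [(approx[2 * i] + approx[2 * i + 1]) / s for i in range(half)]
        dif = [(approx[2 * i] - approx[2 * i + 1]) / s for i in range(half)]
        details.append(dif)
        approx = avg
    # Inverse transform, shrinking each detail coefficient toward zero.
    for dif in reversed(details):
        rebuilt = []
        for a, d in zip(approx, dif):
            d = math.copysign(max(abs(d) - threshold, 0.0), d)
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        approx = rebuilt
    return approx


def local_maxima(values):
    """Return indices of local maxima; a flat top yields its middle index."""
    peaks = []
    n = len(values)
    i = 1
    while i < n - 1:
        if values[i] > values[i - 1]:
            j = i
            while j < n - 1 and values[j + 1] == values[i]:
                j += 1
            if j < n - 1 and values[j + 1] < values[i]:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks


def peak_volume(values, index, half_width=2):
    """Approximate a peak's volume as the sum of intensities in a window."""
    lo = max(0, index - half_width)
    hi = min(len(values), index + half_width + 1)
    return sum(values[lo:hi])


def pick_peaks(signal, levels=1, threshold=0.5, half_width=2, min_volume=8.0):
    """Smooth, find local maxima, then keep peaks with enough volume."""
    smoothed = haar_smooth(signal, levels, threshold)
    return [
        i for i in local_maxima(smoothed)
        if peak_volume(smoothed, i, half_width) >= min_volume
    ]


# A broad peak centred at index 4 and a narrow spike at index 11.
spectrum = [0, 0, 1, 3, 5, 3, 1, 0, 0, 0, 0, 4, 0, 0, 0, 0]
picked = pick_peaks(spectrum)
```

In this toy spectrum the spike at index 11 is nearly as tall as the broad peak at index 4, so an intensity cutoff alone would struggle to separate them, whereas the windowed volume differs by roughly a factor of three. A real spectrum is 2D or 3D and would need a multidimensional transform and a more careful volume estimate.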

[1] Ingrid Daubechies, et al. Ten Lectures on Wavelets, 1992.

[2] K. Wüthrich. NMR of proteins and nucleic acids, 1988.

[3] Xin Gao, et al. PICKY: a novel SVD-based NMR spectra peak picking method, 2009, Bioinform.

[4] K. Wüthrich, et al. Protein NMR structure determination with automated NOE-identification in the NOESY spectra using the new software ATNOS, 2002, Journal of biomolecular NMR.

[5] H. Kalbitzer, et al. A general Bayesian method for an automated signal class recognition in 2D NMR spectra combined with a multivariate discriminant analysis, 1995, Journal of biomolecular NMR.

[6] Stéphane Mallat, et al. A Theory for Multiresolution Signal Decomposition: The Wavelet Representation, 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[7] Ulrich Günther, et al. Automated Protein NMR Structure Determination Using Wavelet De-noised NOESY Spectra, 2005, Journal of biomolecular NMR.

[8] Menglong Li, et al. Wavelet transform analysis of NMR structure ensembles to reveal internal fluctuations of enzymes, 2011, Amino Acids.

[9] M. Billeter, et al. MUNIN: A new approach to multi-dimensional NMR spectra interpretation, 2001, Journal of biomolecular NMR.

[10] Martin Billeter, et al. MUNIN: Application of three-way decomposition to the analysis of heteronuclear NMR relaxation data, 2001, Journal of biomolecular NMR.

[11] C. Burrus, et al. Noise reduction using an undecimated discrete wavelet transform, 1996, IEEE Signal Processing Letters.

[12] A. Rouh, et al. Bayesian signal extraction from noisy FT NMR spectra, 1994, Journal of biomolecular NMR.

[13] Jean-Marie Dereppe, et al. The continuous wavelet transform, an analysis tool for NMR spectroscopy, 1997.

[14] Xin Gao, et al. Towards Fully Automated Structure-Based NMR Resonance Assignment of 15N-Labeled Proteins From Automatically Picked Peaks, 2011, J. Comput. Biol.

[15] M. Billeter, et al. Automated peak picking and peak integration in macromolecular NMR spectra using AUTOPSY, 1998, Journal of magnetic resonance.

[16] Shuai Cheng Li, et al. IPASS: Error Tolerant NMR Backbone Resonance Assignment by Linear Programming, 2009.

[17] Claudio Nicolini, et al. Neural networks for the peak-picking of nuclear magnetic resonance spectra, 1993, Neural Networks.

[18] Peter Güntert, et al. Automated structure determination from NMR spectra, 2009, European Biophysics Journal.

[19] Simon A. Corne, et al. An artificial neural network for classifying cross peaks in two-dimensional NMR spectra, 1992.

[20] Robert Powers, et al. A common sense approach to peak picking in two-, three-, and four-dimensional spectra using automatic computer analysis of contour diagrams, 1991.

[21] Gerard J. Kleywegt, et al. A versatile approach toward the partially automatic recognition of cross peaks in 2D 1H NMR spectra, 1990.

[22] M. Williamson, et al. Automated protein structure calculation from NMR data, 2009, Journal of biomolecular NMR.

[23] A. Altieri, et al. Automation of NMR structure determination of proteins, 2004, Current opinion in structural biology.

[24] Ludwig, et al. NMRLAB-Advanced NMR data processing in matlab, 2000, Journal of magnetic resonance.

[25] Heinz Rüterjans, et al. WAVEWAT-improved solvent suppression in NMR spectra employing wavelet transforms, 2002, Journal of magnetic resonance.

[26] Xin Gao, et al. Towards Automated Structure-Based NMR Resonance Assignment, 2010, RECOMB.

[27] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[28] Bruce A. Johnson, et al. NMR View: A computer program for the visualization and analysis of NMR data, 1994, Journal of biomolecular NMR.

[29] G. Neue, et al. Simplification of dynamic NMR spectroscopy by wavelet transforms, 1996, Solid state nuclear magnetic resonance.

[30] Xueguang Shao, et al. Resolution of the NMR Spectrum Using Wavelet Transform, 2000.

[31] W. Gronwald, et al. Automated structure determination of proteins by NMR spectroscopy, 2004.