Reversible jump MCMC approach for peak identification for stroke SELDI mass spectrometry using mixture model

Mass spectrometry (MS) has shown great potential for detecting disease-related biomarkers for the early diagnosis of stroke. To discover potential biomarkers in large volumes of noisy MS data, peak detection must be performed first. This article proposes a novel automatic peak detection method for stroke MS data. In this method, the spectrum is described by a mixture model. A Bayesian approach is used to estimate the parameters of the mixture model, and a Markov chain Monte Carlo (MCMC) method is employed to perform the Bayesian inference. By introducing a reversible jump method, the number of peaks in the model is estimated automatically. Instead of separating peak detection into substeps, the proposed method performs baseline correction, denoising and peak identification simultaneously. It therefore minimizes the risk of introducing irrecoverable bias and errors at each substep. In addition, the method does not require a manually selected denoising threshold. Experimental results on both a simulated dataset and a stroke MS dataset show that the proposed method not only detects peaks with low signal-to-noise ratio, but also greatly reduces the false detection rate while maintaining the same sensitivity. Contact: XZhou@tmhs.org
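To make the idea of a mixture model with an unknown number of peaks concrete, the following is a minimal sketch of a birth/death reversible jump sampler. It is not the authors' implementation. Everything in it is an assumption made for illustration: Gaussian-shaped peaks, a known constant baseline and a known noise level (the article estimates baseline and noise jointly with the peaks), a uniform prior on the number of peaks up to `k_max`, uniform priors on peak centre, height and width, and new peaks proposed directly from their prior. Under those assumptions, with birth and death each attempted with probability 0.5, the acceptance ratio reduces to the likelihood ratio. All function and parameter names are hypothetical.

```python
import math
import random


def model(x, peaks, baseline):
    """Assumed spectrum model: constant baseline plus Gaussian-shaped peaks.

    Each peak is a tuple (centre, height, width).
    """
    return baseline + sum(
        h * math.exp(-0.5 * ((x - c) / w) ** 2) for c, h, w in peaks
    )


def log_likelihood(xs, ys, peaks, baseline, sigma):
    """Log-likelihood under i.i.d. Gaussian noise, up to an additive constant."""
    return -0.5 * sum(
        ((y - model(x, peaks, baseline)) / sigma) ** 2 for x, y in zip(xs, ys)
    )


def draw_peak_from_prior(rng, x_min, x_max, h_max, w_min, w_max):
    """Draw (centre, height, width) from independent uniform priors (assumed)."""
    return (
        rng.uniform(x_min, x_max),
        rng.uniform(0.0, h_max),
        rng.uniform(w_min, w_max),
    )


def rj_sample(xs, ys, n_iter, k_max=10, baseline=0.0, sigma=1.0,
              h_max=10.0, w_min=0.5, w_max=5.0, seed=0):
    """Birth/death reversible jump sampler; returns the trace of peak counts."""
    rng = random.Random(seed)
    peaks = []
    ll = log_likelihood(xs, ys, peaks, baseline, sigma)
    trace = []
    for _ in range(n_iter):
        proposal = None
        if rng.random() < 0.5:
            # Birth move: add one peak drawn from the prior at a random position.
            if len(peaks) < k_max:
                new = draw_peak_from_prior(
                    rng, min(xs), max(xs), h_max, w_min, w_max
                )
                pos = rng.randrange(len(peaks) + 1)
                proposal = peaks[:pos] + [new] + peaks[pos:]
        else:
            # Death move: remove one existing peak chosen uniformly at random.
            if peaks:
                pos = rng.randrange(len(peaks))
                proposal = peaks[:pos] + peaks[pos + 1:]
        # Moves that would leave the range 0..k_max are rejected outright.
        if proposal is not None:
            ll_new = log_likelihood(xs, ys, proposal, baseline, sigma)
            # With prior proposals and a uniform prior on the peak count,
            # the acceptance probability is min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, ll_new - ll)):
                peaks, ll = proposal, ll_new
        trace.append(len(peaks))
    return trace
```

The returned trace records the number of peaks at each iteration, so its histogram approximates the posterior distribution over the peak count. Proposing from the prior keeps the acceptance ratio simple but mixes slowly; a practical sampler would also update the parameters of existing peaks within each model and use more informed proposals.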
