Detecting and aligning peaks in mass spectrometry data with applications to MALDI

In this paper, we address the peak detection and alignment problem in the analysis of mass spectrometry data. To deal with the peak redundancy problem in MALDI data acquired in reflectron mode, we propose using an amplitude modulation technique for peak detection. The alignment of two peak sets is formulated as a non-rigid registration problem and solved with a robust point matching (RPM) approach. To align multiple peak sets, we first use a super set method to find a common peak set among all peak sets, which serves as a standard, and then align each peak set to the standard sequentially with the RPM approach (i.e., we align only one peak set to the standard at a time, thus reducing the multiple peak set alignment problem to a simpler two peak set alignment problem). Experimental results from a study of an ovarian cancer data set show that the quantitative cross-correlation coefficients among technical replicates increase after peak alignment. Additional comparisons demonstrate that our method performs similarly to the hierarchical clustering method, although the two methods are implemented differently.
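The sequential strategy described above (build one standard, then align each peak set to it one at a time) can be illustrated with a minimal Python sketch. It is not the method of this paper: pooling and merging nearby peaks only stands in for the super set method, and a nearest-neighbour match within a tolerance stands in for RPM. The function names and the `tol` parameter are hypothetical and introduced purely for illustration.

```python
from typing import List, Optional


def build_standard(peak_sets: List[List[float]], tol: float) -> List[float]:
    """Pool all peaks and merge neighbours into standard peaks.

    Assumption: a simple stand-in for the super set method. A peak joins
    the current group when it lies within `tol` of the previous peak in
    that group; each group is summarised by its mean m/z.
    """
    pooled = sorted(mz for peaks in peak_sets for mz in peaks)
    groups: List[List[float]] = []
    for mz in pooled:
        if groups and mz - groups[-1][-1] <= tol:
            groups[-1].append(mz)
        else:
            groups.append([mz])
    return [sum(g) / len(g) for g in groups]


def align_to_standard(
    peaks: List[float], standard: List[float], tol: float
) -> List[Optional[float]]:
    """Align one peak set to the standard (the two-set problem).

    Assumption: nearest-neighbour matching replaces RPM here. Each
    standard peak receives the closest observed peak within `tol`,
    or None when no observed peak is close enough.
    """
    aligned: List[Optional[float]] = []
    for ref in standard:
        best = min(peaks, key=lambda mz: abs(mz - ref), default=None)
        if best is not None and abs(best - ref) <= tol:
            aligned.append(best)
        else:
            aligned.append(None)
    return aligned


def align_all(peak_sets: List[List[float]], tol: float):
    """Align every peak set to the shared standard, one set at a time."""
    standard = build_standard(peak_sets, tol)
    aligned = [align_to_standard(p, standard, tol) for p in peak_sets]
    return standard, aligned


if __name__ == "__main__":
    # Made-up m/z values for three replicates, for illustration only.
    replicates = [[1000.0, 2000.2], [1000.4, 1999.8], [1000.2]]
    standard, aligned = align_all(replicates, tol=1.0)
    print(standard)
    print(aligned)
```

Because every peak set is matched against the same standard, the aligned lists share one index, so replicates can be compared position by position. A missing peak appears as `None` rather than shifting later entries.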
