Independent component analysis for the extraction of reliable protein signal profiles from MALDI-TOF mass spectra

MOTIVATION: Independent component analysis (ICA) is a signal processing technique that recovers statistically independent signals from a set of their linear mixtures. We propose ICA for the analysis of signals obtained from large proteomics investigations, such as clinical multi-subject studies based on MALDI-TOF MS profiling. The method is validated on simulated and experimental data to demonstrate that it correctly extracts protein profiles from MALDI-TOF mass spectra.

RESULTS: In a peak-detection comparison with one open-source and two commercial methods, the proposed approach proved more reliable, reducing the false discovery rate of protein peak masses. Moreover, integrating ICA with statistical tests for differences in peak intensities between experimental groups makes it possible to identify protein peaks that could be indicators of a diseased state. This data-driven approach is a promising tool for biomarker-discovery studies based on MALDI-TOF MS technology.

AVAILABILITY: The MATLAB implementation of the method described in the article, together with the simulated and experimental data, is freely available at http://www.unich.it/proteomica/bioinf/.
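To make the idea of recovering independent signals from their linear mixtures concrete, the following is a minimal two-channel sketch in Python. It is not the authors' MATLAB implementation: the sparse synthetic sources, the mixing coefficients, and all function names (`whiten`, `ica_2d`, `sparse_source`, and so on) are assumptions made for illustration, and the unmixing step uses a simple grid search over rotation angles that maximizes squared excess kurtosis rather than any specific ICA algorithm used in the article.

```python
import math
import random


def center(x):
    """Subtract the sample mean."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def mean_prod(x, y):
    """Sample mean of the element-wise product."""
    return sum(a * b for a, b in zip(x, y)) / len(x)


def rotate(x, y, angle):
    """Rotate the paired samples (x, y) by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * a + s * b for a, b in zip(x, y)],
            [-s * a + c * b for a, b in zip(x, y)])


def whiten(x1, x2):
    """Center, decorrelate, and scale two channels to unit variance."""
    x1, x2 = center(x1), center(x2)
    a, b, d = mean_prod(x1, x1), mean_prod(x1, x2), mean_prod(x2, x2)
    phi = 0.5 * math.atan2(2.0 * b, a - d)  # diagonalizes the 2x2 covariance
    u1, u2 = rotate(x1, x2, phi)
    sd1 = math.sqrt(mean_prod(u1, u1))
    sd2 = math.sqrt(mean_prod(u2, u2))
    return [v / sd1 for v in u1], [v / sd2 for v in u2]


def excess_kurtosis(z):
    """Excess kurtosis of a zero-mean, unit-variance sample."""
    return sum(v ** 4 for v in z) / len(z) - 3.0


def ica_2d(x1, x2, steps=90):
    """Toy ICA: pick the rotation of the whitened data that is least Gaussian."""
    z1, z2 = whiten(x1, x2)
    best = None
    for k in range(steps):
        theta = (math.pi / 2) * k / steps  # the contrast repeats every 90 degrees
        y1, y2 = rotate(z1, z2, theta)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def abs_corr(x, y):
    """Absolute Pearson correlation (ICA leaves sign and order undetermined)."""
    x, y = center(x), center(y)
    return abs(mean_prod(x, y)) / math.sqrt(mean_prod(x, x) * mean_prod(y, y))


def sparse_source(rng, n, density=0.1):
    """Hypothetical sparse, non-negative source: mostly zero, occasional spikes."""
    return [rng.expovariate(1.0) if rng.random() < density else 0.0
            for _ in range(n)]


rng = random.Random(0)
n = 2000
s1 = sparse_source(rng, n)
s2 = sparse_source(rng, n)

# Two observed channels, each an (assumed) linear mixture of both sources.
x1 = [0.8 * a + 0.3 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 0.7 * b for a, b in zip(s1, s2)]

y1, y2 = ica_2d(x1, x2)
print("best match for s1:", max(abs_corr(s1, y1), abs_corr(s1, y2)))
print("best match for s2:", max(abs_corr(s2, y1), abs_corr(s2, y2)))
```

Recovered components are compared with the sources through absolute correlation because ICA cannot determine their sign, scale, or order. The grid search is only practical for two channels; a multi-subject spectral data set would call for a general-purpose ICA algorithm instead.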
