Exploring liquid chromatography–mass spectrometry fingerprints of urine samples from patients with prostate or urinary bladder cancer

Abstract Data processing and analysis have become true rate- and success-limiting factors in molecular research when the data set comprises a large number of highly complex samples. In general, rather complicated methodologies are needed to combine and compare the information obtained from selected analytical platforms. Although commercial as well as freely accessible software for high-throughput data processing is available for most platforms, tailored in-house solutions for data management and analysis can provide the versatility and transparency needed for, e.g., method development and pilot studies. This paper describes a procedure for exploring metabolic fingerprints in urine samples from prostate and bladder cancer patients with a set of in-house developed Matlab tools. In spite of the immense amount of data produced by the LC–MS platform, in this study more than 10^10 data points, it is shown that the data processing tasks can be handled with reasonable computer resources. The preprocessing steps include baseline subtraction and noise reduction, followed by an initial time alignment. In the data analysis the fingerprints are treated as 2-D images, i.e. pixel by pixel, in contrast to the more common list-based approach that follows peak or feature detection. Although the latter approach greatly reduces the data complexity, it also involves a critical step that may obscure essential information because of undetected or misaligned peaks. The effects of time shifts remaining after the initial alignment are reduced by a binning and ‘blurring’ procedure prior to the comparative multivariate and univariate data analyses. Factors other than cancer assignment were taken into account by ANOVA applied to the PCA scores as well as to the individual variables (pixels).
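The binning and ‘blurring’ step mentioned above can be illustrated with a minimal sketch. It is written in Python with NumPy rather than with the in-house Matlab tools, and it rests on assumptions not stated in the abstract: each fingerprint is taken to be a 2-D array with time along axis 0 and m/z along axis 1, binning is done by summing non-overlapping blocks, and the blur is a plain moving average along the time axis. The function names `bin_image` and `blur_time` and all parameter values are hypothetical.

```python
import numpy as np


def bin_image(img, t_bin, mz_bin):
    """Sum intensities over non-overlapping blocks of t_bin x mz_bin pixels.

    Rows and columns that do not fill a complete block are dropped.
    """
    img = np.asarray(img, dtype=float)
    n_t = (img.shape[0] // t_bin) * t_bin
    n_mz = (img.shape[1] // mz_bin) * mz_bin
    trimmed = img[:n_t, :n_mz]
    blocks = trimmed.reshape(n_t // t_bin, t_bin, n_mz // mz_bin, mz_bin)
    return blocks.sum(axis=(1, 3))


def blur_time(img, width):
    """Moving-average blur along the time axis (axis 0).

    Assumes width is odd and not larger than the number of time points.
    """
    img = np.asarray(img, dtype=float)
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="same"), 0, img
    )


# Toy fingerprint: 4 time points x 6 m/z channels.
toy = np.arange(24, dtype=float).reshape(4, 6)
binned = bin_image(toy, t_bin=2, mz_bin=3)

# A single sharp peak is spread over neighbouring time points,
# so a small residual time shift changes the pixel values less.
peak = np.zeros((5, 1))
peak[2, 0] = 3.0
blurred = blur_time(peak, width=3)
```

Binning reduces the number of pixels, while blurring trades time resolution for tolerance to misalignment; the kernel shape and width used in the study are not given here, so the values above are placeholders only.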
It was found that the analytical day-to-day variations in our study had a large confounding effect on the cancer-related differences, which emphasizes the role of proper normalization and/or experimental design. While PCA could not establish significant cancer-related patterns, the pixel-wise univariate analysis provided a list of about a hundred ‘hotspots’ indicating possible biomarkers. Providing such a list was the limited goal of this study, which focused on the exploration of a very large and complex data set. True biomarker identification, however, needs thorough validation and verification in separate patient sets.
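As a rough illustration of the pixel-wise univariate analysis, the sketch below computes a one-way ANOVA F statistic for every pixel at once. This is a simplification and an assumption: the study also accounts for factors other than cancer assignment (such as analysis day), which a one-way model ignores. The function name `pixelwise_f`, the data layout (samples in rows, unfolded pixels in columns), and the toy values are hypothetical.

```python
import numpy as np


def pixelwise_f(X, groups):
    """One-way ANOVA F statistic per pixel.

    X      : array of shape (n_samples, n_pixels), one unfolded image per row
    groups : array of shape (n_samples,), group label of each sample
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    grand_mean = X.mean(axis=0)

    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for label in labels:
        Xg = X[groups == label]
        group_mean = Xg.mean(axis=0)
        ss_between += Xg.shape[0] * (group_mean - grand_mean) ** 2
        ss_within += ((Xg - group_mean) ** 2).sum(axis=0)

    df_between = len(labels) - 1
    df_within = X.shape[0] - len(labels)
    # Pixels with zero within-group variance yield inf or nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (ss_between / df_between) / (ss_within / df_within)


# Toy data: 6 samples, 2 pixels; only pixel 0 differs between the groups.
X = np.array([[1, 5], [2, 6], [3, 7], [11, 5], [12, 6], [13, 7]])
groups = np.array([0, 0, 0, 1, 1, 1])
F = pixelwise_f(X, groups)
```

Pixels with large F values would be candidate ‘hotspots’. With very many pixels tested, some correction for multiple testing is needed before any threshold is meaningful, and the threshold itself is a choice not specified here.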
