Standardizing Proteomics Workflow for Liquid Chromatography-Mass Spectrometry: Technical and Statistical Considerations

Introduction: Quantitative measurements from liquid chromatography (LC) coupled with mass spectrometry (MS) often suffer from missing values (MVs) and from data heterogeneity caused by technical variability. We analyzed a proteomics data set generated from human kidney biopsy material to investigate the technical effects of sample preparation and quantitative MS. Methods: We studied the effect of tissue storage methods (TSMs) and tissue extraction methods (TEMs) on data analysis. There were two TSMs, frozen (FR) and formalin-fixed paraffin-embedded (FFPE), and three TEMs: MAX, TX followed by MAX, and SDS followed by MAX. We assessed the impact of different strategies for analyzing the data while accounting for heterogeneity and MVs, and used an analysis of variance (ANOVA) model to study the effects of the various sources of variability. Results and Conclusion: We found that the FFPE TSM performed better than the FR TSM, and that the one-step TEM (MAX) performed better than the two-step TEMs. Furthermore, imputing MVs was a better approach than excluding proteins with MVs or using an unbalanced design.
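
To make the design concrete, the following is a minimal sketch of a two-factor ANOVA decomposition (TSM by TEM) for a single protein after missing values have been imputed. It is not the authors' implementation: the intensity values are invented for illustration, the function names `impute_cell_mean` and `two_way_anova_ss` are hypothetical, and cell-mean imputation is assumed only because the abstract does not say which imputation method was used.

```python
from itertools import product
from statistics import mean

# Hypothetical log2 intensities for ONE protein, three replicates per cell.
# Keys are (TSM, TEM); None marks a missing value. All numbers are invented.
cells = {
    ("FR", "MAX"): [20.1, 19.8, None],
    ("FR", "TX+MAX"): [19.2, 19.5, 19.0],
    ("FR", "SDS+MAX"): [18.9, None, 19.3],
    ("FFPE", "MAX"): [21.0, 20.7, 20.9],
    ("FFPE", "TX+MAX"): [20.2, 20.0, 20.4],
    ("FFPE", "SDS+MAX"): [19.9, 20.1, 19.8],
}


def impute_cell_mean(data):
    """Fill each missing value with the mean of the observed values in its cell.

    Assumption: a simple stand-in for whatever imputation the study used.
    """
    filled = {}
    for key, values in data.items():
        observed = [v for v in values if v is not None]
        if not observed:
            raise ValueError(f"cell {key} has no observed values")
        fill = mean(observed)
        filled[key] = [fill if v is None else v for v in values]
    return filled


def two_way_anova_ss(data):
    """Sums of squares for a balanced two-way design with interaction."""
    n = len(next(iter(data.values())))
    if any(len(v) != n for v in data.values()):
        raise ValueError("design must be balanced (equal replicates per cell)")
    a_levels = sorted({k[0] for k in data})
    b_levels = sorted({k[1] for k in data})
    grand = mean([v for vals in data.values() for v in vals])
    a_mean = {a: mean([v for b in b_levels for v in data[(a, b)]]) for a in a_levels}
    b_mean = {b: mean([v for a in a_levels for v in data[(a, b)]]) for b in b_levels}
    cell_mean = {k: mean(v) for k, v in data.items()}
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a, b in product(a_levels, b_levels)
    )
    ss_err = sum((v - cell_mean[k]) ** 2 for k, vals in data.items() for v in vals)
    ss_total = sum((v - grand) ** 2 for vals in data.values() for v in vals)
    return {"TSM": ss_a, "TEM": ss_b, "TSM:TEM": ss_ab, "error": ss_err, "total": ss_total}


filled = impute_cell_mean(cells)  # restores the balanced design
ss = two_way_anova_ss(filled)
for source, value in ss.items():
    print(f"{source:8s} SS = {value:.4f}")
```

Imputation restores the balance that the decomposition requires, which is one reason to prefer it over dropping the protein or fitting an unbalanced design. Note that cell-mean imputation shrinks the error sum of squares, so a real analysis would need to account for that, for example by adjusting the error degrees of freedom or by using a more principled imputation method.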
