Detecting Significant Changes in Protein Abundance

We review and demonstrate how an empirical Bayes method that shrinks each protein's sample variance towards a pooled estimate yields far more powerful and stable inference for detecting significant changes in protein abundance than ordinary t-tests. Using examples from isobaric mass-labeled proteomic experiments, we show how to analyze data from multiple experiments simultaneously and discuss the effects of missing data on the inference. We also present easy-to-use, open-source software for normalizing mass spectrometry data and for inference based on moderated test statistics.
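To make the shrinkage idea concrete, the following is a minimal sketch of a one-sample moderated t-statistic on a protein's log ratios. It is an illustration only and not the software described above: the function name `moderated_t` and the parameters `prior_var` and `prior_df` are hypothetical, and the prior values are passed in by hand as an assumption, whereas an empirical Bayes analysis would typically estimate them from the variances of all proteins.

```python
from math import sqrt
from statistics import mean, variance


def moderated_t(log_ratios, prior_var, prior_df):
    """One-sample moderated t-statistic for a single protein.

    log_ratios : replicate log abundance ratios for the protein
    prior_var  : pooled (prior) variance, assumed given here
    prior_df   : prior degrees of freedom, i.e. the weight of the pooled estimate
    """
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two replicates to estimate a variance")
    if prior_var <= 0 or prior_df <= 0:
        raise ValueError("prior_var and prior_df must be positive")

    d = n - 1                      # residual degrees of freedom
    s2 = variance(log_ratios)      # the protein's own sample variance

    # Shrink the sample variance towards the pooled estimate,
    # weighting each by its degrees of freedom.
    s2_post = (prior_df * prior_var + d * s2) / (prior_df + d)

    # Same form as the ordinary t-statistic, with the shrunken variance.
    t_mod = mean(log_ratios) / sqrt(s2_post / n)

    # The reference t-distribution gains the prior degrees of freedom.
    return t_mod, s2_post, prior_df + d


if __name__ == "__main__":
    # Made-up log ratios whose sample variance is tiny by chance.
    ratios = [0.10, 0.11, 0.09]
    t_ord = mean(ratios) / sqrt(variance(ratios) / len(ratios))
    t_mod, s2_post, df = moderated_t(ratios, prior_var=0.04, prior_df=4.0)
    print(f"ordinary t = {t_ord:.2f} on {len(ratios) - 1} df")
    print(f"moderated t = {t_mod:.2f} on {df:.0f} df")
```

With only a few replicates, a sample variance that happens to be very small inflates the ordinary t-statistic; pulling the variance towards the pooled value dampens such cases, which is one way to read the stability claimed for the moderated statistic.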
