Enhanced differential expression statistics for data-independent acquisition proteomics

We describe a new reproducibility-optimization method, ROPECA, for the statistical analysis of proteomics data, with a specific focus on the emerging data-independent acquisition (DIA) mass spectrometry technology. ROPECA optimizes the reproducibility of statistical testing at the peptide level and aggregates the peptide-level changes to determine differential expression at the protein level. Using a ‘gold standard’ spike-in dataset and a hybrid proteome benchmark dataset, we show that ROPECA performs competitively against conventional protein-based analysis as well as state-of-the-art peptide-based tools, especially in DIA data with consistent peptide measurements. Furthermore, we demonstrate the improved accuracy of our method in clinical studies using proteomics data from a longitudinal human twin study.
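
The idea of aggregating peptide-level test results into a protein-level result can be illustrated with a minimal sketch. It is not the ROPECA implementation: the reproducibility-optimization step is omitted, the peptide-level p-values are taken as given, and the function names and the aggregation rule are assumptions made for illustration. The sketch takes the median peptide p-value of a protein (the lower middle value when the count is even) and assesses it against its null distribution, assuming independent peptide p-values that are uniform under the null hypothesis.

```python
from math import comb


def order_statistic_cdf(x, k, n):
    """P(k-th smallest of n independent Uniform(0,1) values <= x)."""
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def protein_p_value(peptide_p_values):
    """Aggregate peptide-level p-values into one protein-level p-value.

    Hypothetical helper: uses the median peptide p-value (lower middle
    value for an even count) and evaluates how likely a median this small
    would be if all peptide p-values were independent and uniform.
    """
    ps = sorted(peptide_p_values)
    n = len(ps)
    if n == 0:
        raise ValueError("at least one peptide p-value is required")
    k = (n + 1) // 2  # rank of the (lower) median
    return order_statistic_cdf(ps[k - 1], k, n)


# Made-up example data: peptide-level p-values grouped by protein.
peptides = {
    "protein_A": [0.01, 0.02, 0.03],  # consistently small p-values
    "protein_B": [0.01, 0.60, 0.90],  # one small p-value among large ones
}
for protein, p_values in peptides.items():
    print(protein, protein_p_value(p_values))
```

With this rule, a protein whose peptides agree on a change receives a much smaller protein-level p-value than a protein with a single outlying peptide, which is why consistent peptide measurements matter for this kind of aggregation.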
