IsoProt: A Complete and Reproducible Workflow To Analyze iTRAQ/TMT Experiments

Reproducibility has become a major concern in biomedical research. In proteomics, bioinformatic workflows can quickly grow to comprise multiple software tools, each with its own set of parameters. Using them often requires defining hundreds of parameters as well as performing data operations to ensure tool interoperability. Hence, a manuscript’s methods section is often insufficient to completely describe a data analysis workflow, let alone to reproduce it. Here we present IsoProt, a complete and reproducible bioinformatic workflow deployed in a portable container environment to analyze data from isobarically labeled, quantitative proteomics experiments. The workflow uses only open-source tools and provides a user-friendly, interactive browser interface to configure and execute the different operations. Once the workflow has been executed, the results, including the R code used to perform the statistical analyses, can be downloaded as an HTML document that provides a complete record of the performed analyses. IsoProt therefore represents a reproducible bioinformatics workflow that will yield identical results on any computer platform.
