OpenMS: a flexible open-source software platform for mass spectrometry data analysis

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software platform designed specifically for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. In addition, OpenMS provides a set of 185 tools and ready-made workflows for these tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
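To make the role of standardized open data formats concrete, the following minimal sketch counts spectra per MS level in an mzML-style document using only the Python standard library. It does not use the OpenMS or pyOpenMS API. The inline sample is a stripped-down, hypothetical fragment (not a schema-valid mzML file), and the function name `count_ms_levels` is introduced here purely for illustration. The namespace and the controlled-vocabulary accession `MS:1000511` ("ms level") are taken from the mzML standard.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by mzML documents.
MZML_NS = {"mz": "http://psi.hupo.org/ms/mzml"}
# PSI-MS controlled-vocabulary accession for the "ms level" term.
MS_LEVEL_ACCESSION = "MS:1000511"

# Hypothetical, heavily reduced mzML-style fragment for illustration only;
# a real file carries many more required elements and binary peak data.
SAMPLE = """
<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="demo_run">
    <spectrumList count="3">
      <spectrum id="scan=1" index="0" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum id="scan=2" index="1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
      <spectrum id="scan=3" index="2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>
"""


def count_ms_levels(mzml_text):
    """Return a Counter mapping MS level (int) to number of spectra."""
    root = ET.fromstring(mzml_text)
    levels = Counter()
    for spectrum in root.iterfind(".//mz:spectrum", MZML_NS):
        # Look only at cvParam elements that are direct children of <spectrum>.
        for param in spectrum.iterfind("mz:cvParam", MZML_NS):
            if param.get("accession") == MS_LEVEL_ACCESSION:
                levels[int(param.get("value"))] += 1
                break
    return levels


if __name__ == "__main__":
    for level, n in sorted(count_ms_levels(SAMPLE).items()):
        print(f"MS{level}: {n} spectra")
```

Because the MS level is identified by its controlled-vocabulary accession rather than by a vendor-specific field, the same lookup applies to any tool that writes the open format, which is the interoperability the abstract refers to.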
