TOPPAS: a graphical workflow editor for the analysis of high-throughput proteomics data.

Mass spectrometry coupled to high-performance liquid chromatography (HPLC-MS) is evolving more quickly than ever. A wide range of instrument types and experimental setups is in common use. Modern instruments acquire huge amounts of data and therefore require tools for efficient, automated data analysis. Most existing software for analyzing HPLC-MS data is monolithic and tailored toward a specific application. A more flexible alternative is a pipeline-based toolkit that allows custom analysis workflows to be constructed from small building blocks, e.g., the Trans-Proteomic Pipeline (TPP) or The OpenMS Proteomics Pipeline (TOPP). One drawback, however, is the hurdle of setting up complex workflows with command-line tools. We present TOPPAS, The OpenMS Proteomics Pipeline ASsistant, a graphical user interface (GUI) for rapid composition of HPLC-MS analysis workflows. Workflow construction reduces to dragging and dropping analysis tools and drawing connections between them. External tools can be integrated into these workflows as well. Once workflows have been developed, they can be deployed in other workflow management systems or batch processing systems in a fully automated fashion. The implementation is portable and has been tested under Windows, Mac OS X, and Linux. TOPPAS is open-source software and available free of charge at http://www.OpenMS.de/TOPPAS .
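The idea of composing a workflow from tools and connections can be illustrated with a minimal sketch. It is not the TOPPAS implementation: it only assumes that a workflow can be modeled as a directed acyclic graph whose nodes run once all of their predecessors have finished. The node names, the `run_workflow` helper, and the string-based "tools" are hypothetical placeholders, not real TOPP tools or file formats.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(edges, steps, source_input):
    """Run each node after all of its predecessors.

    edges: maps a node name to the set of nodes it depends on.
    steps: maps a node name to a callable taking a list of inputs.
    source_input: value handed to nodes that have no predecessors.
    """
    order = list(TopologicalSorter(edges).static_order())
    results = {}
    for node in order:
        # Sort predecessors so the input order is deterministic.
        predecessors = sorted(edges.get(node, ()))
        inputs = [results[p] for p in predecessors] or [source_input]
        results[node] = steps[node](inputs)
    return order, results


# Hypothetical workflow: one input node, a linear step, a branch, and a merge.
edges = {
    "peak_picking": {"input_files"},
    "feature_finding": {"peak_picking"},
    "identification": {"peak_picking"},
    "annotate": {"feature_finding", "identification"},
}

# Placeholder "tools" that only record which inputs they received.
steps = {
    "input_files": lambda inputs: inputs[0],
    "peak_picking": lambda inputs: f"picked({','.join(inputs)})",
    "feature_finding": lambda inputs: f"found({','.join(inputs)})",
    "identification": lambda inputs: f"identified({','.join(inputs)})",
    "annotate": lambda inputs: f"annotated({','.join(inputs)})",
}

order, results = run_workflow(edges, steps, "raw.mzML")
print(order)
print(results["annotate"])
```

In this sketch the two branches after `peak_picking` may run in either order, but `annotate` always runs last because it depends on both. A graphical editor would build the `edges` mapping from the connections the user draws.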
