qcML: An Exchange Format for Quality Control Metrics from Mass Spectrometry Experiments

Quality control is increasingly recognized as a crucial aspect of mass spectrometry-based proteomics. Several recent papers discuss relevant parameters for quality control and present applications that extract these parameters from the instrumental raw data. What has been missing, however, is a standard data exchange format for reporting these performance metrics. We therefore developed the qcML format, an XML-based standard that follows the design principles of the related mzML, mzIdentML, mzQuantML, and TraML standards from the HUPO-PSI (Proteomics Standards Initiative). In addition to the XML format, we provide tools for calculating a wide range of quality metrics, as well as a database format and interconversion tools, so that existing LIMS can easily add relational storage of the quality control data to their existing schema. Here we describe the qcML specification, along with possible use cases and an illustrative example of the subsequent analysis possibilities. All information about qcML is available at http://code.google.com/p/qcml.
