A community proposal to integrate proteomics activities in ELIXIR

Computational approaches have been major drivers behind the progress of proteomics in recent years. The aim of this white paper is to provide a framework for integrating computational proteomics into ELIXIR in the near future, and thus to broaden the portfolio of omics technologies supported by this European distributed infrastructure. The white paper is the direct result of a strategy meeting on ‘The Future of Proteomics in ELIXIR’ that took place in March 2017 in Tübingen (Germany) and involved representatives of eleven ELIXIR nodes. The discussions at that meeting led to a list of priority areas in computational proteomics that would complement existing activities and close gaps in the portfolio of tools and services offered by ELIXIR so far. We provide some suggestions on how these activities could be integrated into ELIXIR’s existing platforms, and how they could lead to a new ELIXIR use case in proteomics. We also highlight connections to the related field of metabolomics, where similar activities are ongoing. This white paper could thus serve as a starting point for the integration of computational proteomics into ELIXIR. Over the next few months we will be working closely with all stakeholders involved, and in particular with other representatives of the proteomics community, to further refine this paper.
