Management and dissemination of MS proteomic data with PROTICdb: Example of a quantitative comparison between methods of protein extraction

High-throughput MS‐based proteomic experiments generate large volumes of complex data and require bioinformatics tools to handle them. Needs include means to archive the data, to disseminate them to the scientific community, and to organize and annotate them to facilitate their interpretation. We present here an evolution of PROTICdb, a database application that now handles MS data, including quantification. PROTICdb has been developed to be as independent as possible from the tools used to produce the data. Biological samples and proteomics data are described using ontology terms. An embedded Taverna workflow automatically retrieves information related to identified proteins by querying external databases. Stored data can be displayed graphically, and a “Query Builder” allows users to make sophisticated queries without knowledge of the underlying database structure. All resources can be accessed programmatically using a Java client API or RESTful web services, allowing the integration of PROTICdb into any portal. An example application is presented, in which proteins extracted from a maize leaf sample by four different methods were compared using a label‐free shotgun method. Data are available at http://moulon.inra.fr/protic/public. PROTICdb thus provides means for the storage, enrichment, and dissemination of proteomics data.
