Tranche distributed repository and ProteomeCommons.org.

Tranche is a distributed repository designed to redundantly store and disseminate data sets for the proteomics community. It offers several features important to researchers, including support for large data files, prepublication access controls, licensing options, and assurance of both data provenance and integrity. Tranche integrates tightly with ProteomeCommons.org, an online community resource that offers a variety of tools for proteomics researchers, including project management and data annotation. In this chapter, we discuss the development of Tranche and ProteomeCommons.org, paying particular attention to why data should be publicly available and unrestricted and to the challenges facing data archiving and open access. We then provide a technical overview of Tranche and ProteomeCommons.org, followed by step-by-step instructions for using these resources through the graphical user interface (GUI), the command-line tools, and the application programming interface (API). We end with a brief discussion of current and future development efforts and collaborations.
