Paper information - Homology-driven assembly of NOn-redundant protEin sequence sets (NOmESS) for mass spectrometry


Summary: To enable mass spectrometry (MS)-based proteomic studies of poorly characterized organisms, we developed a computational workflow for the homology-driven assembly of a non-redundant reference sequence dataset. In the automated pipeline, translated DNA sequences (e.g. ESTs, RNA deep-sequencing data) are aligned to those of a closely related and fully sequenced organism. Representative sequences are derived from each cluster and joined, resulting in a non-redundant reference set representing the maximal available amino acid sequence information for each protein. Here, we applied NOmESS to assemble a reference database for the widely used model organism Xenopus laevis and demonstrate its use in proteomic applications.

Availability and implementation: NOmESS is written in C#. The source code and the executables can be downloaded from http://www.biochem.mpg.de/cox. Running NOmESS additionally requires BLASTp and cd-hit.

Contact: cox@biochem.mpg.de

Supplementary information: Supplementary data are available at Bioinformatics online.
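The grouping step described in the summary can be illustrated with a minimal Python sketch. It is not the NOmESS implementation (which is written in C#) and makes several assumptions: the alignments are available as BLASTp tabular output (`-outfmt 6`, with the bit score in the twelfth column), each translated sequence is assigned to the reference protein of its highest-scoring hit, and the longest member stands in as the representative of each cluster. The function names `best_hits`, `cluster_by_reference` and `pick_representatives` are hypothetical, and the joining of representative sequences mentioned in the summary is not modelled here.

```python
from collections import defaultdict


def best_hits(blast_tab_lines):
    """Map each query ID to (reference ID, bit score) of its highest-scoring hit.

    Assumes BLAST tabular format 6: qseqid, sseqid, ..., bitscore (12th column).
    """
    best = {}
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, ref, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (ref, bitscore)
    return best


def cluster_by_reference(best):
    """Group query IDs by the reference protein they hit best."""
    clusters = defaultdict(list)
    for query, (ref, _score) in best.items():
        clusters[ref].append(query)
    return dict(clusters)


def pick_representatives(clusters, sequences):
    """Pick the longest sequence per cluster (an assumed, simplified criterion).

    Ties are broken by query ID so that the result is deterministic.
    """
    return {
        ref: max(members, key=lambda q: (len(sequences[q]), q))
        for ref, members in clusters.items()
    }


if __name__ == "__main__":
    # Toy input: three translated ESTs aligned against two reference proteins.
    hits = [
        "est1\tREF_A\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200",
        "est1\tREF_B\t60.0\t80\t32\t0\t1\t80\t1\t80\t1e-10\t80",
        "est2\tREF_A\t90.0\t120\t12\t0\t1\t120\t1\t120\t1e-45\t180",
        "est3\tREF_B\t88.0\t90\t11\t0\t1\t90\t1\t90\t1e-30\t120",
    ]
    seqs = {"est1": "MKT", "est2": "MKTAYIAK", "est3": "MSD"}
    clusters = cluster_by_reference(best_hits(hits))
    print(pick_representatives(clusters, seqs))
```

Keying the clusters on the reference protein means the resulting set contains at most one entry per reference protein, which is one simple way to obtain a non-redundant set; a real pipeline would also need to filter weak hits and merge partially overlapping fragments.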

Jürgen Cox | Matthias Mann | Tikira Temu | Markus Räschle
