Developing a bioinformatics framework for proteogenomics

In the 15 years since the human genome was first sequenced, genome sequencing and annotation have continued to improve. However, genome annotation has not kept pace with the accelerating rate of genome sequencing, and as a result there is now a large backlog of genomic data waiting to be interpreted quickly and accurately. Advances in proteomics have given rise to a new field, termed proteogenomics, which uses peptide mass spectrometry data to improve genome annotation: it enables the discovery of novel protein-coding genes, as well as the refinement and validation of known and putative protein-coding genes. Genome annotation relies heavily on ab initio gene prediction programs and/or the mapping of a range of RNA transcripts. Although these methods provide insight into the gene content of genomes, they cannot distinguish protein-coding genes from putative non-coding RNA genes. The problem is compounded by the fact that only 5% of the public protein sequence repository at UniProt/SwissProt has been curated and derived from actual protein evidence. This thesis contends that it is critically important to incorporate proteomics data into genome annotation pipelines to provide experimental protein-coding evidence. Although proteogenomics has improved considerably over the last decade, numerous challenges remain. The key challenges are the loss of sensitivity when searching inflated search spaces of putative sequences, how best to interpret novel identifications, and how best to control for false discoveries. This thesis addresses the gap between the use of genomic and proteomic sources for accurate genome annotation by applying a proteogenomics approach with a customised methodology. The approach was applied in four case studies: a prokaryote (a bacterium); a monocotyledonous plant (wheat); a dicotyledonous plant (grape); and human.
The key contributions of this thesis are: a new methodology for proteogenomics analysis; 145 suggested gene refinements in Bradyrhizobium diazoefficiens (a nitrogen-fixing bacterium); 55 new gene predictions (57 protein isoforms) in Vitis vinifera (grape); 49 new gene predictions (52 protein isoforms) in Homo sapiens (human); and 67 new gene predictions (70 protein isoforms) in Triticum aestivum (bread wheat). Lastly, a number of possible improvements, both to the studies conducted in this thesis and to proteogenomics as a whole, are identified and discussed.


[223]  C. Bessant,et al.  GAPP: a fully automated software for the confident identification of human peptides from tandem mass spectra. , 2006, Journal of proteome research.

[224]  Linfeng Wu,et al.  Overcoming the dynamic range problem in mass spectrometry-based shotgun proteomics , 2006, Expert review of proteomics.

[225]  R. Beavis,et al.  Using annotated peptide mass spectrum libraries for protein identification. , 2006, Journal of proteome research.

[226]  James P. Reilly,et al.  A computational approach toward label-free protein quantification using predicted peptide detectability , 2006, ISMB.

[227]  R. Sachidanandam,et al.  Comprehensive splice-site analysis using comparative genomics , 2006, Nucleic acids research.

[228]  B. Shen,et al.  The bifunctional glyceryl transferase/phosphatase OzmB belonging to the HAD superfamily that diverts 1,3-bisphosphoglycerate into polyketide biosynthesis. , 2006, Journal of the American Chemical Society.

[229]  M. Gorenstein,et al.  Simultaneous Qualitative and Quantitative Analysis of theEscherichia coli Proteome , 2006, Molecular & Cellular Proteomics.

[230]  T. Mitchell-Olds,et al.  Comparative genomics as a tool for gene discovery. , 2006, Current opinion in biotechnology.

[231]  T. Köcher,et al.  A general precursor ion‐like scanning mode on quadrupole‐TOF instruments compatible with chromatographic separation , 2006, Proteomics.

[232]  S. Hanash,et al.  Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study , 2006, Nature Biotechnology.

[233]  G. Weinstock,et al.  Creating a honey bee consensus gene set , 2007, Genome Biology.

[234]  Aaron A. Klammer,et al.  Effects of modified digestion schemes on the identification of proteins from complex mixtures. , 2006, Journal of proteome research.

[235]  Markus Müller,et al.  Automated protein identification by tandem mass spectrometry: issues and strategies. , 2006, Mass spectrometry reviews.

[236]  Serafim Batzoglou,et al.  CONTRAfold: RNA secondary structure prediction without physics-based models , 2006, ISMB.

[237]  B. Ueberheide,et al.  The utility of ETD mass spectrometry in proteomic analysis. , 2006, Biochimica et biophysica acta.

[238]  N. Kelleher,et al.  Top Down Mass Spectrometry of <60-kDa Proteins from Methanosarcina acetivorans Using Quadrupole FTMS with Automated Octopole Collisionally Activated Dissociation*S , 2006, Molecular & Cellular Proteomics.

[239]  R. Aebersold,et al.  Dynamic Spectrum Quality Assessment and Iterative Computational Analysis of Shotgun Proteomic Data , 2006, Molecular & Cellular Proteomics.

[240]  Brendan K Faherty,et al.  Optimization and Use of Peptide Mass Measurement Accuracy in Shotgun Proteomics*S , 2006, Molecular & Cellular Proteomics.

[241]  I. Eidhammer,et al.  Improving the reliability and throughput of mass spectrometry‐based proteomics by spectrum quality filtering , 2006, Proteomics.

[242]  Fang-Xiang Wu,et al.  Quality Assessment of Peptide Tandem Mass Spectra , 2006, IMSCCS.

[243]  Peicheng Du,et al.  Automatic deconvolution of isotope-resolved mass spectra using variable selection and quantized peptide mass distribution. , 2006, Analytical chemistry.

[244]  F. Halgand,et al.  Top-down mass spectrometry of integral membrane proteins , 2006, Expert review of proteomics.

[245]  Inna Dubchak,et al.  The integrated microbial genomes (IMG) system , 2005, Nucleic Acids Res..

[246]  N. Kelleher,et al.  Top-down proteomics on a chromatographic time scale using linear ion trap fourier transform hybrid mass spectrometers. , 2007, Analytical chemistry.

[247]  James P. Reilly,et al.  Advancement in Protein Inference from Shotgun Proteomics Using Peptide Detectability , 2006, Pacific Symposium on Biocomputing.

[248]  William Stafford Noble,et al.  Semi-supervised learning for peptide identification from shotgun proteomics datasets , 2007, Nature Methods.

[249]  Joachim M. Buhmann,et al.  PepSplice: cache-efficient search algorithms for comprehensive identification of tandem mass spectra , 2007, Bioinform..

[250]  Richard Durbin,et al.  Genomix: a method for combining gene-finders' predictions, which uses evolutionary conservation of sequence and intron-exon structure , 2007, Bioinform..

[251]  Dustin A. Cartwright,et al.  A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety , 2007, PloS one.

[252]  G. McAlister,et al.  Performance Characteristics of Electron Transfer Dissociation Mass Spectrometry*S , 2007, Molecular & Cellular Proteomics.

[253]  David Goldberg,et al.  Lookup peaks: a hybrid of de novo sequencing and database search for protein identification by tandem mass spectrometry. , 2007, Analytical chemistry.

[254]  M. Gelfand,et al.  Comparative Genomics and Evolution of Alternative Splicing: The Pessimists′ Science , 2007 .

[255]  Alexey I Nesvizhskii,et al.  Protein identification by tandem mass spectrometry and sequence database searching. , 2007, Methods in molecular biology.

[256]  R. Guigó,et al.  Improving gene annotation using peptide mass spectrometry. , 2007, Genome research.

[257]  M. Mann,et al.  Higher-energy C-trap dissociation for peptide modification analysis , 2007, Nature Methods.

[258]  D. C. Simpson,et al.  Proteomic profiling of intact proteins using WAX-RPLC 2-D separations and FTICR mass spectrometry. , 2007, Journal of proteome research.

[259]  M. Gerstein,et al.  What is a gene, post-ENCODE? History and updated definition. , 2007, Genome research.

[260]  Sean L Seymour,et al.  The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra*S , 2007, Molecular & Cellular Proteomics.

[261]  Rick L. Stevens,et al.  The RAST Server: Rapid Annotations using Subsystems Technology , 2008, BMC Genomics.

[262]  Koby Crammer,et al.  Global Discriminative Learning for Higher-Accuracy Computational Gene Prediction , 2007, PLoS Comput. Biol..

[263]  J. Galagan,et al.  Conrad: gene prediction using conditional random fields. , 2007, Genome research.

[264]  Steven P Gygi,et al.  Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry , 2007, Nature Methods.

[265]  D. Tabb,et al.  Proteomic parsimony through bipartite graph analysis improves accuracy and transparency. , 2007, Journal of proteome research.

[266]  Namshin Kim,et al.  The ASAP II database: analysis and comparative genomics of alternative splicing in 15 animal species , 2006, Nucleic Acids Res..

[267]  F. Halgand,et al.  Protein-Sequence Polymorphisms and Post-translational Modifications in Proteins from Human Saliva using Top-Down Fourier-transform Ion Cyclotron Resonance Mass Spectrometry. , 2007, International journal of mass spectrometry.

[268]  Tom S. Price,et al.  EBP, a Program for Protein Identification Using Multiple Tandem Mass Spectrometry Datasets*S , 2007, Molecular & Cellular Proteomics.

[269]  Jonathan E. Allen,et al.  Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments , 2007, Genome Biology.

[270]  J. Whitelegge,et al.  Increased coverage in the transmembrane domain with activated-ion electron capture dissociation for top-down Fourier-transform mass spectrometry of integral membrane proteins. , 2007, Journal of proteome research.

[271]  Nichole L. King,et al.  Development and validation of a spectral library searching method for peptide identification from MS/MS , 2007, Proteomics.

[272]  Patrick G. A. Pedrioli,et al.  A high-quality catalog of the Drosophila melanogaster proteome , 2007, Nature Biotechnology.

[273]  William Stafford Noble,et al.  Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project , 2007, Nature.

[274]  D. Oesterhelt,et al.  Large-scale identification of N-terminal peptides in the halophilic archaea Halobacterium salinarum and Natronomonas pharaonis. , 2007, Journal of proteome research.

[275]  Debojyoti Dutta,et al.  MSNovo: a dynamic programming algorithm for de novo peptide sequencing via tandem mass spectrometry. , 2007, Analytical chemistry.

[276]  Henning Hermjakob,et al.  Five years of progress in the Standardization of Proteomics Data 4th Annual Spring Workshop of the HUPO‐Proteomics Standards Initiative April 23–25, 2007 Ecole Nationale Supérieure (ENS), Lyon, France , 2007, Proteomics.

[277]  J. Poulain,et al.  The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla , 2007, Nature.

[278]  Mark Gerstein,et al.  Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation , 2006, Nucleic Acids Res..

[279]  M. Mann,et al.  On the Proper Use of Mass Accuracy in Proteomics* , 2007, Molecular & Cellular Proteomics.

[280]  Steven Salzberg,et al.  Identifying bacterial genes and endosymbiont DNA with Glimmer , 2007, Bioinform..

[281]  G. McAlister,et al.  Supplemental activation method for high-efficiency electron-transfer dissociation of doubly protonated peptide precursors. , 2007, Analytical chemistry.

[282]  Richard D. Smith,et al.  Whole proteome analysis of post-translational modifications: applications of mass-spectrometry for proteogenomic annotation. , 2007, Genome research.

[283]  Knut Reinert,et al.  OpenMS – An open-source software framework for mass spectrometry , 2008, BMC Bioinformatics.

[284]  M. Brent How does eukaryotic gene prediction work? , 2007, Nature Biotechnology.

[285]  Ting Chen,et al.  Speeding up tandem mass spectrometry database search: metric embeddings and fast near neighbor search , 2007, Bioinform..

[286]  Alexey I Nesvizhskii,et al.  Analysis and validation of proteomic data generated by tandem mass spectrometry , 2007, Nature Methods.

[287]  N. Edwards,et al.  Novel peptide identification from tandem mass spectra using ESTs and sequence database compression , 2007, Molecular systems biology.

[288]  G. Damonte,et al.  How to discriminate between leucine and isoleucine by low energy ESI-TRAP MSn , 2007, Journal of the American Society for Mass Spectrometry.

[289]  Daniel B. Martin,et al.  Computational prediction of proteotypic peptides for quantitative proteomics , 2007, Nature Biotechnology.

[290]  D. Lauffenburger,et al.  Multiple reaction monitoring for robust quantitative proteomic analysis of cellular signaling networks , 2007, Proceedings of the National Academy of Sciences.

[291]  F. McLafferty,et al.  Top‐down MS, a powerful complement to the high capabilities of proteolysis proteomics , 2007, The FEBS journal.

[292]  Chuong B. Do,et al.  CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction , 2007, Genome Biology.

[293]  Yong-Bin Kim,et al.  ProSight PTM 2.0: improved protein identification and characterization for top down mass spectrometry , 2007, Nucleic Acids Res..

[294]  N. Kelleher,et al.  Decoding protein modifications using top-down mass spectrometry , 2007, Nature Methods.

[295]  Yi-Kuo Yu,et al.  Statistical Characterization of a 1D Random Potential Problem - with applications in score statistics of MS-based peptide sequencing. , 2008, Physica A.

[296]  A. Pandey,et al.  Comprehensive Comparison of Collision Induced Dissociation and Electron Transfer Dissociation , 2008, Analytical chemistry.

[297]  G. McAlister,et al.  Decision tree–driven tandem mass spectrometry for shotgun proteomics , 2008, Nature Methods.

[298]  E. Deutsch mzML: A single, unifying data format for mass spectrometer output , 2008, Proteomics.

[299]  William Stafford Noble,et al.  Assigning significance to peptides identified by tandem mass spectrometry using decoy databases. , 2008, Journal of proteome research.

[300]  P. Bork,et al.  Large gene overlaps in prokaryotic genomes: result of functional constraints or mispredictions? , 2008, BMC Genomics.

[301]  P. Pevzner,et al.  Interpreting top-down mass spectra using spectral alignment. , 2008, Analytical chemistry.

[302]  P. Bork,et al.  Molecular eco-systems biology: towards an understanding of community function , 2008, Nature Reviews Microbiology.

[303]  A. Shevchenko,et al.  Protein identification pipeline for the homology-driven proteomics. , 2008, Journal of proteomics.

[304]  Samuel H. Payne,et al.  Discovery and revision of Arabidopsis genes by proteogenomics , 2008, Proceedings of the National Academy of Sciences.

[305]  P. Andrews,et al.  A spectral clustering approach to MS/MS identification of post-translational modifications. , 2008, Journal of proteome research.

[306]  William Stafford Noble,et al.  Posterior error probabilities and false discovery rates: two sides of the same coin. , 2008, Journal of proteome research.

[307]  Brad T. Sherman,et al.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources , 2008, Nature Protocols.

[308]  M. Brent Steady progress and recent breakthroughs in the accuracy of automated genome annotation , 2008, Nature Reviews Genetics.

[309]  J. C. Tran,et al.  Gel-eluted liquid fraction entrapment electrophoresis: an electrophoretic method for broad molecular weight range proteome separation. , 2008, Analytical chemistry.

[310]  William Stafford Noble,et al.  Rapid and accurate peptide identification from tandem mass spectra. , 2008, Journal of proteome research.

[311]  Richard D. Smith,et al.  Does trypsin cut before proline? , 2008, Journal of proteome research.

[312]  Richard D. Smith,et al.  Proteogenomics: needs and roles to be filled by proteomics in genome annotation. , 2008, Briefings in functional genomics & proteomics.

[313]  Daniel B. Goodman,et al.  Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes. , 2008, Genome research.

[314]  Richard D. Smith,et al.  De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins. , 2008, Analytical chemistry.

[315]  Qian Liu,et al.  Evigan: a hidden variable model for integrating gene evidence for eukaryotic gene prediction , 2008, Bioinform..

[316]  M. Savitski,et al.  Electron capture/transfer versus collisionally activated/induced dissociations: Solo or duet? , 2008, Journal of the American Society for Mass Spectrometry.

[317]  B. V. Breukelen,et al.  Targeted SCX Based Peptide Fractionation for Optimal Sequencing by Collision Induced, and Electron Transfer Dissociation , 2008 .

[318]  A. Shevchenko,et al.  Separating the wheat from the chaff: unbiased filtering of background tandem mass spectra improves protein identification. , 2008, Journal of proteome research.

[319]  Richard D. Smith,et al.  Clustering millions of tandem mass spectra. , 2008, Journal of proteome research.

[320]  R. Aebersold,et al.  Selected reaction monitoring for quantitative proteomics: a tutorial , 2008, Molecular systems biology.

[321]  J. Garin,et al.  PepLine: a software pipeline for high-throughput direct mapping of tandem mass spectrometry data on genomic sequences. , 2008, Journal of proteome research.

[322]  D. Ghosh,et al.  Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling. , 2008, Journal of proteome research.

[323]  P. Pevzner,et al.  Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases. , 2008, Journal of proteome research.

[324]  C. Ling,et al.  PeakSelect: preprocessing tandem mass spectra for better peptide identification. , 2008, Rapid communications in mass spectrometry : RCM.

[325]  M. Tress,et al.  Proteomics studies confirm the presence of alternative protein isoforms on a large scale , 2008, Genome Biology.

[326]  N. Kelleher,et al.  "Proteotyping": population proteomics of human leukocytes using top down mass spectrometry. , 2008, Analytical chemistry.

[327]  Marshall W. Bern,et al.  Spectrum Fusion: Using Multiple Mass Spectra for De Novo Peptide Sequencing , 2008, RECOMB.

[328]  Robert A. Grothe,et al.  Precursor-ion mass re-estimation improves peptide identification on hybrid instruments. , 2008, Journal of proteome research.

[329]  A. Sickmann,et al.  Application of electron transfer dissociation (ETD) for the analysis of posttranslational modifications , 2008, Proteomics.

[330]  Ronald J. Moore,et al.  Mass spectrometry analysis of proteome-wide proteolytic post-translational degradation of proteins. , 2008, Analytical chemistry.

[331]  Morgan C. Giddings,et al.  Using GFS to Identify Encoding Genomic Loci from Protein Mass Spectral Data , 2008, Current protocols in bioinformatics.

[332]  D. Bartel MicroRNAs: Target Recognition and Regulatory Functions , 2009, Cell.

[333]  N. M. Karabacak,et al.  Sensitive and Specific Identification of Wild Type and Variant Proteins from 8 to 669 kDa Using Top-down Mass Spectrometry*S , 2009, Molecular & Cellular Proteomics.

[334]  Norman W. Paton,et al.  Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines , 2009, Proteomics.

[335]  P. Pevzner,et al.  Spectral Profiles, a Novel Representation of Tandem Mass Spectra and Their Applications for de Novo Peptide Sequencing and Identification* , 2009, Molecular & Cellular Proteomics.

[336]  Steven J. M. Jones,et al.  Circos: an information aesthetic for comparative genomics. , 2009, Genome research.

[337]  Michael Travers,et al.  BioBIKE: A Web-based, programmable, integrated biological knowledge base , 2009, Nucleic Acids Res..

[338]  Samuel I. Miller,et al.  Precursor acquisition independent from ion count: how to dive deeper into the proteomics ocean. , 2009, Analytical chemistry.

[339]  Johnf . Thompson,et al.  Virtual Terminator nucleotides for next generation DNA sequencing , 2009, Nature Methods.

[340]  P. Demirev,et al.  Top-down identification of protein biomarkers in bacteria with unsequenced genomes. , 2009, Analytical chemistry.

[341]  Cheng Soon Ong,et al.  mGene: accurate SVM-based gene finding with an application to nematode genomes. , 2009, Genome research.

[342]  Ari M Frank,et al.  A ranking-based scoring function for peptide-spectrum matches. , 2009, Journal of proteome research.

[343]  F. Sobott,et al.  Comparison of CID versus ETD based MS/MS fragmentation for the analysis of protein ubiquitination , 2009, Journal of the American Society for Mass Spectrometry.

[344]  Michael A. Freitas,et al.  MassMatrix: A database search program for rapid characterization of proteins and peptides from tandem mass spectrometry data , 2009, Proteomics.

[345]  S. Mohammed,et al.  Improved identification of endogenous peptides from murine nervous tissue by multiplexed peptide extraction methods and multiplexed mass spectrometric analysis. , 2009, Journal of proteome research.

[346]  H. Rehrauer,et al.  Deterministic protein inference for shotgun proteomics data provides new insights into Arabidopsis pollen development and function. , 2009, Genome research.

[347]  Dan Golick,et al.  Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures , 2009, Proteomics.

[348]  T. Borodina,et al.  Transcriptome analysis by strand-specific sequencing of complementary DNA , 2009, Nucleic acids research.

[349]  E. Pennisi DNA sequencing. No genome left behind. , 2009, Science.

[350]  Andreas Quandt,et al.  SwissPIT: An workflow‐based platform for analyzing tandem‐MS spectra using the Grid , 2009, Proteomics.

[351]  D. Goodlett,et al.  Precursor ion independent algorithm for top-down shotgun proteomics , 2009, Journal of the American Society for Mass Spectrometry.

[352]  Michael D. Litton,et al.  IDPicker 2.0: Improved protein assembly with high discrimination peptide identification filtering. , 2009, Journal of proteome research.

[353]  P. Pevzner,et al.  Spectral Dictionaries , 2009, Molecular & Cellular Proteomics.

[354]  Gonçalo R. Abecasis,et al.  The Sequence Alignment/Map format and SAMtools , 2009, Bioinform..

[355]  M. Mann,et al.  Universal sample preparation method for proteome analysis , 2009, Nature Methods.

[356]  Bing Zhang,et al.  Network-assisted protein identification and data interpretation in shotgun proteomics , 2009, Molecular systems biology.

[357]  O. Poch,et al.  Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol. , 2008, Genome research.

[358]  P. Pevzner,et al.  False discovery rates of protein identifications: a strike against the two-peptide rule. , 2009, Journal of proteome research.

[359]  R. Aebersold,et al.  Applying mass spectrometry-based proteomics to genetics, genomics and network biology , 2009, Nature Reviews Genetics.

[360]  David Edwards,et al.  De novo sequencing of plant genomes using second-generation technologies , 2009, Briefings Bioinform..

[361]  Miriam L. Land,et al.  Trace: Tennessee Research and Creative Exchange Prodigal: Prokaryotic Gene Recognition and Translation Initiation Site Identification Recommended Citation Prodigal: Prokaryotic Gene Recognition and Translation Initiation Site Identification , 2022 .

[362]  Brian D Halligan,et al.  Low cost, scalable proteomics data analysis using Amazon's cloud computing services and open source search algorithms. , 2009, Journal of proteome research.

[363]  Samuel H. Payne,et al.  A proteogenomic update to Yersinia: enhancing genome annotation , 2010, BMC Genomics.

[364]  Cole Trapnell,et al.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. , 2010, Nature biotechnology.

[365]  P. Pevzner,et al.  Deconvolution and Database Search of Complex Tandem Mass Spectra of Intact Proteins , 2010, Molecular & Cellular Proteomics.

[366]  Jennifer L. Harrow,et al.  Meeting report: a workshop on Best Practices in Genome Annotation , 2010, Database J. Biol. Databases Curation.

[367]  A. Nesvizhskii,et al.  Computational analysis of unassigned high‐quality MS/MS spectra in proteomic data sets , 2010, Proteomics.

[368]  Q. Zeng,et al.  Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop , 2010, Viruses.

[369]  Brandon M. Malone,et al.  The Proteogenomic Mapping Tool , 2011, BMC Bioinformatics.

[370]  N. Ahn,et al.  Quantifying the impact of chimera MS/MS spectra on peptide identification in large-scale proteomics studies. , 2010, Journal of proteome research.

[371]  R. Macknight,et al.  Noncanonical Translation Initiation of the Arabidopsis Flowering Time and Alternative Polyadenylation Regulator FCA[C][W] , 2010, Plant Cell.

[372]  V. Bafna,et al.  Proteogenomics to discover the full coding content of genomes: a computational perspective. , 2010, Journal of proteomics.

[373]  N. Kelleher,et al.  Size-sorting combined with improved nanocapillary liquid chromatography-mass spectrometry for identification of intact proteins up to 80 kDa. , 2010, Analytical chemistry.

[374]  Nuno Bandeira,et al.  Gapped Spectral Dictionaries and Their Applications for Database Searches of Tandem Mass Spectra* , 2010, Molecular & Cellular Proteomics.

[375]  Yan Fu,et al.  pNovo: de novo peptide sequencing and identification using HCD spectra. , 2010, Journal of proteome research.

[376]  B. Garcia What does the future hold for top down mass spectrometry? , 2010, Journal of the American Society for Mass Spectrometry.

[377]  C. Ahrens,et al.  PeptideClassifier for protein inference and targeted quantitative proteomics , 2010, Nature Biotechnology.

[378]  Yan Fu,et al.  Speeding up tandem mass spectrometry based database searching by peptide and spectrum indexing. , 2010, Rapid communications in mass spectrometry : RCM.

[379]  F. Dupont,et al.  Deciphering the complexities of the wheat flour proteome using quantitative two-dimensional electrophoresis, three proteases and tandem mass spectrometry , 2011, Proteome Science.

[380]  C. Forcato Gene prediction and functional annotation in the Vitis vinifera genome , 2010 .

[381]  Carole A. Goble,et al.  myExperiment: a repository and social network for the sharing of bioinformatics workflows , 2010, Nucleic Acids Res..

[382]  T. Nilsen,et al.  Expansion of the eukaryotic proteome by alternative splicing , 2010, Nature.

[383]  Bin Ma,et al.  Better score function for peptide identification with ETD MS/MS spectra , 2010, BMC Bioinformatics.

[384]  M. Goshe,et al.  Improving protein and proteome coverage through data-independent multiplexed peptide fragmentation. , 2010, Journal of proteome research.

[385]  Johnf . Thompson,et al.  Single Molecule Sequencing with a HeliScope Genetic Analysis System , 2010, Current protocols in molecular biology.

[386]  Kang Ning,et al.  The utility of mass spectrometry-based proteomic data for validation of novel alternative splice forms reconstructed from RNA-Seq data: a preliminary assessment , 2010, BMC Bioinformatics.

[387]  M. Mann,et al.  Proteomics on an Orbitrap Benchtop Mass Spectrometer Using All-ion Fragmentation , 2010, Molecular & Cellular Proteomics.

[388]  S. Turner,et al.  Real-time DNA sequencing from single polymerase molecules. , 2010, Methods in enzymology.

[389]  H. Rehrauer,et al.  Rhizobial adaptation to hosts, a new facet in the legume root-nodule symbiosis. , 2010, Molecular plant-microbe interactions : MPMI.

[390]  P. Pevzner,et al.  The Generating Function of CID, ETD, and CID/ETD Pairs of Tandem Mass Spectra: Applications to Database Search* , 2010, Molecular & Cellular Proteomics.

[391]  Fan Zhang,et al.  PEPPI: a peptidomic database of human protein isoforms for proteomics experiments , 2010, BMC Bioinformatics.

[392]  Serban Nacu,et al.  Fast and SNP-tolerant detection of complex variants and splicing in short reads , 2010, Bioinform..

[393]  R. Sommer,et al.  Proteogenomics of Pristionchus pacificus reveals distinct proteome structure of nematode models. , 2010, Genome research.

[394]  V. Bafna,et al.  Template Proteogenomics: Sequencing Whole Proteins Using an Imperfect Database* , 2010, Molecular & Cellular Proteomics.

[395]  G. Pessi,et al.  An integrated proteomics and transcriptomics reference data set provides new insights into the Bradyrhizobium japonicum bacteroid metabolism in soybean root nodules , 2010, Proteomics.

[396]  R. Aebersold,et al.  Generating and navigating proteome maps using mass spectrometry , 2010, Nature Reviews Molecular Cell Biology.

[397]  Patrick G. A. Pedrioli Trans-Proteomic Pipeline: A Pipeline for Proteomic Analysis , 2010, Proteome Bioinformatics.

[398]  P. Mallick,et al.  Peptide Identification from Mixture Tandem Mass Spectra* , 2010, Molecular & Cellular Proteomics.

[399]  Mihai Pop,et al.  Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies , 2011, BMC Bioinformatics.

[400]  D. Tabb,et al.  TagRecon: high-throughput mutation identification through sequence tagging. , 2010, Journal of proteome research.

[401]  Lennart Martens,et al.  The Proteomics Identifications database: 2010 update , 2009, Nucleic Acids Res..

[402]  Gennifer E. Merrihew,et al.  Deconvolution of mixture spectra from ion-trap data-independent-acquisition tandem mass spectrometry. , 2010, Analytical chemistry.

[403]  M. Schatz,et al.  Assembly of large genomes using second-generation sequencing. , 2010, Genome research.

[404]  Joshua E. Elias,et al.  Target-Decoy Search Strategy for Mass Spectrometry-Based Proteomics , 2010, Proteome Bioinformatics.

[405]  T. Tatusova,et al.  Gnomon – NCBI eukaryotic gene prediction tool , 2010 .

[406]  Shimyn Slomovic,et al.  Addition of poly(A) and poly(A)-rich tails during RNA degradation in the cytoplasm of human cells , 2010, Proceedings of the National Academy of Sciences.

[407]  Nichollas E. Scott,et al.  Simultaneous Glycan-Peptide Characterization Using Hydrophilic Interaction Chromatography and Parallel Fragmentation by CID, Higher Energy Collisional Dissociation, and Electron Transfer Dissociation MS Applied to the N-Linked Glycoproteome of Campylobacter jejuni* , 2010, Molecular & Cellular Proteomics.

[408]  Bin Ma,et al.  Adepts: Advanced peptide de novo Sequencing with a Pair of Tandem Mass Spectra , 2010, J. Bioinform. Comput. Biol..

[409]  J. Gogarten,et al.  Using comparative genome analysis to identify problems in annotated microbial genomes. , 2010, Microbiology.

[410]  Tao Xu,et al.  Bioinformatics Applications Note Sequence Analysis Xdia: Improving on the Label-free Data-independent Analysis , 2022 .

[411]  G. Timp,et al.  Nanopore Sequencing: Electrical Measurements of the Code of Life , 2010, IEEE Transactions on Nanotechnology.

[412]  S. Mohammed,et al.  Phosphopeptide Fragmentation and Analysis by Mass Spectrometry , 2010 .

[413]  A. Nekrutenko,et al.  Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences , 2010, Genome Biology.

[414]  J. Kwon Analysis of membrane proteome by data-dependent LC-MS/MS combined with data-independent LC-MSE technique , 2010 .

[415]  Lennart Martens,et al.  jmzML, an open‐source Java API for mzML, the PSI standard for MS data , 2010, Proteomics.

[416]  K. Resing,et al.  IsoformResolver: A Peptide-Centric Algorithm for Protein Inference , 2011, Journal of proteome research.

[417]  James C. Wright,et al.  Shotgun proteomics aids discovery of novel protein-coding genes, alternative splicing, and "resurrected" pseudogenes in the mouse genome. , 2011, Genome research.

[418]  M. DePristo,et al.  A framework for variation discovery and genotyping using next-generation DNA sequencing data , 2011, Nature Genetics.

[419]  Mark Yandell,et al.  MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects , 2011, BMC Bioinformatics.

[420]  Samuel H. Payne,et al.  Proteogenomic Analysis of Bacteria and Archaea: A 46 Organism Case Study , 2011, PloS one.

[421]  Birgit Schilling,et al.  ScanRanker: Quality assessment of tandem mass spectra via sequence tagging. , 2010, Journal of proteome research.

[422]  Nandini A. Sahasrabuddhe,et al.  A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry. , 2011, Genome research.

[423]  T. Tatusova,et al.  Solving the Problem: Genome Annotation Standards before the Data Deluge , 2011, Standards in genomic sciences.

[424]  S. Mohammed,et al.  Improved peptide identification by targeted fragmentation using CID, HCD and ETD on an LTQ-Orbitrap Velos. , 2011, Journal of proteome research.

[425]  P. Bourne,et al.  Peptide Identification by Database Search of Mixture Tandem Mass Spectra , 2011, Molecular & Cellular Proteomics.

[426]  J. Armengaud,et al.  Comparative Proteogenomics of Twelve Roseobacter Exoproteomes Reveals Different Adaptive Strategies Among These Marine Bacteria , 2011, Molecular & Cellular Proteomics.

[427]  Cole Trapnell,et al.  Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. , 2011, Genes & development.

[428]  Pavel A. Pevzner,et al.  Spectral Archives: Extending Spectral Libraries to Analyze both Identified and Unidentified Spectra , 2011, Nature Methods.

[429]  W. Pao,et al.  A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics , 2011, Molecular & Cellular Proteomics.

[430]  J. Armengaud,et al.  High-throughput proteogenomics of Ruegeria pomeroyi: seeding a better genomic annotation for the whole marine Roseobacter clade , 2012, BMC Genomics.

[431]  J. Baur,et al.  Resveratrol and life extension , 2011, Annals of the New York Academy of Sciences.

[432]  D. Goodlett,et al.  Faster, quantitative, and accurate precursor acquisition independent from ion count. , 2011, Analytical chemistry.

[433]  Fabian J. Theis,et al.  MIPS: curated databases and comprehensive secondary data resources in 2010 , 2010, Nucleic Acids Res..

[434]  P. Stadler,et al.  The enigmatic mitochondrial genome of Rhabdopleura compacta (Pterobranchia) reveals insights into selection of an efficient tRNA system and supports monophyly of Ambulacraria , 2011, BMC Evolutionary Biology.

[435]  T. Arnesen Towards a Functional Understanding of Protein N-Terminal Acetylation , 2011, PLoS biology.

[436]  P. Pevzner,et al.  Target-Decoy Approach and False Discovery Rate: When Things May Go Wrong , 2011, Journal of the American Society for Mass Spectrometry.

[437]  Eunok Paek,et al.  Fast Multi-blind Modification Search through Tandem Mass Spectrometry , 2011, Molecular & Cellular Proteomics.

[438]  Amit Kumar Yadav,et al.  MassWiz: a novel scoring algorithm with target-decoy based analysis pipeline for tandem mass spectrometry. , 2011, Journal of proteome research.

[439]  Adam Hunter,et al.  Yabi: An online research environment for grid, high performance and cloud computing , 2012, Source Code for Biology and Medicine.

[440]  Arthur W. Toga,et al.  Applications of the pipeline environment for visual informatics and genomics computations , 2011, BMC Bioinformatics.

[441]  Richard D. LeDuc,et al.  Mapping Intact Protein Isoforms in Discovery Mode Using Top Down Proteomics , 2011, Nature.

[442]  M. Mann,et al.  Andromeda: a peptide search engine integrated into the MaxQuant environment. , 2011, Journal of proteome research.

[443]  J. Dickerson,et al.  Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences , 2012, BMC Research Notes.

[444]  B. Simons,et al.  Performance characteristics of a new hybrid quadrupole time-of-flight tandem mass spectrometer (TripleTOF 5600). , 2011, Analytical chemistry.

[445]  Raymond K. Auerbach,et al.  A User's Guide to the Encyclopedia of DNA Elements (ENCODE) , 2011, PLoS biology.

[446]  Bernard P. Puc,et al.  An integrated semiconductor device enabling non-optical genome sequencing , 2011, Nature.

[447]  Henry H. N. Lam Spectral archives: a vision for future proteomics data repositories , 2011, Nature Methods.

[448]  Lin Liu,et al.  Comparison of Next-Generation Sequencing Systems , 2012, Journal of biomedicine & biotechnology.

[449]  Nuno Bandeira,et al.  False discovery rates in spectral identification , 2012, BMC Bioinformatics.

[450]  Y. Benjamini,et al.  Summarizing and correcting the GC content bias in high-throughput sequencing , 2012, Nucleic acids research.

[451]  S. Hubbard,et al.  Addressing Statistical Biases in Nucleotide-Derived Protein Databases for Proteogenomic Search Strategies , 2012, Journal of proteome research.

[452]  H. Swerdlow,et al.  A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers , 2012, BMC Genomics.

[453]  Christopher W. Maier,et al.  GOFAST: an integrated approach for efficient and comprehensive membrane proteome analysis. , 2012, Analytical chemistry.

[454]  David G. Knowles,et al.  The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression , 2012, Genome research.

[455]  M. Yandell,et al.  A beginner's guide to eukaryotic genome annotation , 2012, Nature Reviews Genetics.

[456]  Chad R. Weisbrod,et al.  Accurate peptide fragment mass analysis: multiplexed peptide identification and quantification. , 2012, Journal of proteome research.

[457]  P. Kersey,et al.  Analysis of the bread wheat genome using whole genome shotgun sequencing , 2012, Nature.

[458]  C. Dekker,et al.  DNA sequencing with nanopores , 2012, Nature Biotechnology.

[459]  E. Pennisi Genomics. ENCODE project writes eulogy for junk DNA. , 2012, Science.

[460]  N. Kelleher,et al.  Spinning up mass spectrometry for whole protein complexes , 2012, Nature Methods.

[461]  M. Tomita,et al.  Mass spectrum sequential subtraction speeds up searching large peptide MS/MS spectra datasets against large nucleotide databases for proteogenomics , 2012, Genes to cells : devoted to molecular & cellular mechanisms.

[462]  Bryan P. Early,et al.  A Protease for Middle Down Proteomics , 2012, Nature Methods.

[463]  David L Tabb,et al.  Pepitome: evaluating improved spectral library search for identification complementarity and quality assessment. , 2012, Journal of proteome research.

[464]  Cole Trapnell,et al.  TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions , 2013, Genome Biology.

[465]  Natalie I. Tasman,et al.  A Cross-platform Toolkit for Mass Spectrometry and Proteomics , 2012, Nature Biotechnology.

[466]  N. Castellana Proteogenomics : applications of mass spectrometry at the interface of genomics and proteomics , 2012 .

[467]  Morgan C. Giddings,et al.  Whole human genome proteogenomic mapping for ENCODE cell line data: identifying protein-coding regions , 2013, BMC Genomics.

[468]  Johannes Griss,et al.  jmzReader: A Java parser library to process and visualize multiple text and XML-based mass spectrometry data formats , 2012, Proteomics.

[469]  Ying S. Ting,et al.  Protein Identification Using Top-Down Spectra , 2012, Molecular & Cellular Proteomics.

[470]  P. Dorrestein,et al.  The spectral networks paradigm in high throughput mass spectrometry. , 2012, Molecular bioSystems.

[471]  Juan Antonio Vizcaíno,et al.  jmzIdentML API: A Java interface to the mzIdentML standard for peptide and protein identification data , 2012, Proteomics.

[472]  Kenny Q. Ye,et al.  An integrated map of genetic variation from 1,092 human genomes , 2012, Nature.

[473]  Ludovic C. Gillet,et al.  Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis , 2012, Molecular & Cellular Proteomics.

[474]  T. Flutre,et al.  TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes , 2012, Front. Plant Sci..

[475]  K. Clauser,et al.  Shotgun Protein Sequencing with Meta-contig Assembly , 2012, Molecular & Cellular Proteomics.

[476]  Matthias Hein,et al.  Isotope pattern deconvolution for peptide mass spectrometry by non-negative least squares/least absolute deviation template matching , 2012, BMC Bioinformatics.

[477]  Alexandra M. E. Jones,et al.  The Ph1 Locus Suppresses Cdk2-Type Activity during Premeiosis and Meiosis in Wheat , 2012, Plant Cell.

[478]  E. Hayden Nanopore genome sequencer makes its debut , 2012 .

[479]  J. Thelen,et al.  The proteomic future: where mass spectrometry should be taking us. , 2012, The Biochemical journal.

[480]  M. Delledonne,et al.  De novo transcriptome characterization of Vitis vinifera cv. Corvina unveils varietal diversity , 2013, BMC Genomics.

[481]  Martin Renqiang Min,et al.  An integrated encyclopedia of DNA elements in the human genome , 2012 .

[482]  M. Mann,et al.  Quantification of the N-glycosylated Secretome by Super-SILAC During Breast Cancer Progression and in Human Blood Samples , 2012, Molecular & Cellular Proteomics.

[483]  Nobel C. Zong,et al.  Integration of Cardiac Proteome Biology and Medicine by a Specialized Knowledgebase , 2013, Circulation research.

[484]  Shivashankar H. Nagaraj,et al.  Proteogenomic Analysis of Bradyrhizobium japonicum USDA110 Using Genosuite, an Automated Multi-algorithmic Pipeline , 2013, Molecular & Cellular Proteomics.

[485]  L. Gatto,et al.  Effects of traveling wave ion mobility separation on data independent acquisition in proteomics studies. , 2013, Journal of proteome research.

[486]  M. Mann,et al.  The coming age of complete, accurate, and ubiquitous proteomes. , 2013, Molecular cell.

[487]  Laurent Gatto,et al.  Improving qualitative and quantitative performance for MS(E)-based label-free proteomics. , 2013, Journal of proteome research.

[488]  G. Cramer,et al.  Proteomic analysis indicates massive changes in metabolism prior to the inhibition of growth and photosynthesis of grapevine (Vitis vinifera L.) in response to water deficit , 2013, BMC Plant Biology.

[489]  Tsunglin Liu,et al.  Effects of GC Bias in Next-Generation-Sequencing Data on De Novo Genome Assembly , 2013, PloS one.

[490]  Paul Blakeley Computational proteomics for genome annotation , 2013 .

[491]  B. Kuster,et al.  MScDB: a mass spectrometry-centric protein sequence database for proteomics. , 2013, Journal of proteome research.

[492]  M. Mann,et al.  Proteomic workflow for analysis of archival formalin‐fixed and paraffin‐embedded clinical samples to a depth of 10 000 proteins , 2013, Proteomics. Clinical applications.

[493]  Rachel M. Adams,et al.  Moving away from the reference genome: evaluating a peptide sequencing tagging approach for single amino acid polymorphism identifications in the genus Populus. , 2013, Journal of proteome research.

[494]  D. Posada,et al.  Origin and Length Distribution of Unidirectional Prokaryotic Overlapping Genes , 2013, G3: Genes, Genomes, Genetics.

[495]  M. Bellgard,et al.  Classification of fish samples via an integrated proteomics and bioinformatics approach , 2013, Proteomics.

[496]  J. D. Sandoval,et al.  STEPS: A grid search methodology for optimized peptide identification filtering of MS/MS database search results , 2013, Proteomics.

[497]  N. Castellana,et al.  Plant proteogenomics: from protein extraction to improved gene predictions. , 2013, Methods in molecular biology.

[498]  M. Mann,et al.  In vivo SILAC-based proteomics reveals phosphoproteome changes during mouse skin carcinogenesis. , 2013, Cell reports.

[499]  Alfonso Valencia,et al.  ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data , 2012, Nucleic Acids Res..

[500]  Pavel A. Pevzner,et al.  UniNovo: A Universal Tool for de Novo Peptide Sequencing , 2013, RECOMB.

[501]  Fang-Xiang Wu,et al.  An improved peptide-spectral matching algorithm through distributed search over multiple cores and multiple CPUs , 2014, Proteome Science.

[502]  Joel A. Kooren,et al.  A two‐step database search method improves sensitivity in peptide sequence matches for metaproteomics and proteogenomics studies , 2013, Proteomics.

[503]  C. Bertelli,et al.  Rapid bacterial genome sequencing: methods and applications in clinical microbiology. , 2013, Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases.

[504]  Thomas R. Gingeras,et al.  STAR: ultrafast universal RNA-seq aligner , 2013, Bioinform..

[505]  Xiaojing Wang,et al.  customProDB: an R package to generate customized protein databases from RNA-Seq data for proteomics search , 2013, Bioinform..

[506]  Francisco J. Planes,et al.  Bioinformatic progress and applications in metaproteogenomics for bridging the gap between genomic sequences and metabolic functions in microbial communities , 2013, Proteomics.

[507]  Ludovic C. Gillet,et al.  Quantitative measurements of N‐linked glycoproteins in human plasma by SWATH‐MS , 2013, Proteomics.

[508]  Eric W. Deutsch,et al.  Combining Results of Multiple Search Engines in Proteomics , 2013, Molecular & Cellular Proteomics.

[509]  Cheng-Yan Kao,et al.  SeqEntropy: Genome-Wide Assessment of Repeats for Short Read Sequencing , 2013, PloS one.

[510]  Gordon Gremme,et al.  GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations , 2013, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[511]  Brian L. Frey,et al.  Discovery and Mass Spectrometric Analysis of Novel Splice-junction Peptides Using RNA-Seq , 2013, Molecular & Cellular Proteomics.

[512]  P. Pevzner,et al.  Identification of ultramodified proteins using top-down tandem mass spectra. , 2013, Journal of proteome research.

[513]  G. Valle,et al.  A deep survey of alternative splicing in grape reveals changes in the splicing machinery related to tissue, stress condition and genotype , 2014, BMC Plant Biology.

[514]  J. Eng,et al.  Comet: An open‐source MS/MS sequence database search tool , 2013, Proteomics.

[515]  Morgan C. Giddings,et al.  Peppy: proteogenomic search software. , 2013, Journal of proteome research.

[516]  G. Church,et al.  Cas9 as a versatile tool for engineering biology , 2013, Nature Methods.

[517]  Kai Pong Law,et al.  Recent advances in mass spectrometry: data independent analysis and hyper reaction monitoring , 2013, Expert review of proteomics.

[518]  Morgan C. Giddings,et al.  A peptide-spectrum scoring system based on ion alignment, intensity, and pair probabilities. , 2013, Journal of proteome research.

[519]  K. Gevaert,et al.  Deep Proteome Coverage Based on Ribosome Profiling Aids Mass Spectrometry-based Protein and Peptide Discovery and Provides Evidence of Alternative Translation Products and Near-cognate Translation Initiation Events , 2013, Molecular & Cellular Proteomics.

[520]  N. Kelleher,et al.  The emergence of top-down proteomics in clinical research , 2013, Genome Medicine.

[521]  K. Clauser,et al.  Sequencing-grade de novo analysis of MS/MS triplets (CID/HCD/ETD) from overlapping peptides. , 2013, Journal of proteome research.

[522]  Helga Thorvaldsdóttir,et al.  Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration , 2012, Briefings Bioinform..

[523]  California Jack Cassidy,et al.  An Automated Proteogenomic Method Uses Mass Spectrometry to Reveal Novel Genes in Zea mays , 2013, Molecular & Cellular Proteomics.

[524]  Jarrett D. Egertson,et al.  Multiplexed MS/MS for Improved Data Independent Acquisition , 2013, Nature Methods.

[525]  Ernesto Picardi,et al.  REDItools: high-throughput RNA editing detection made easy , 2013, Bioinform..

[526]  Dylan J. Sorensen,et al.  Label-Free Quantitation and Mapping of the ErbB2 Tumor Receptor by Multiple Protease Digestion with Data-Dependent (MS1) and Data-Independent (MS2) Acquisitions , 2013, International journal of proteomics.

[527]  Nuno Bandeira,et al.  Spectral Library Generating Function for Assessing Spectrum-Spectrum Match Significance , 2013, RECOMB.

[528]  Thomas R. Gingeras,et al.  Comment on “TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions” by Kim et al. , 2013 .

[529]  Rob Smith,et al.  Mspire-Simulator: LC-MS shotgun proteomic simulator for creating realistic gold standard data. , 2013, Journal of proteome research.

[530]  B. Maček,et al.  Deep Coverage of the Escherichia coli Proteome Enables the Assessment of False Discovery Rates in Simple Proteogenomic Experiments , 2013, Molecular & Cellular Proteomics.

[531]  James E. Johnson,et al.  Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations , 2014, BMC Genomics.

[532]  Karl G. Kugler,et al.  Genome interplay in the grain transcriptome of hexaploid bread wheat , 2014, Science.

[533]  Andrew R. Jones,et al.  ProteomeXchange provides globally co-ordinated proteomics data submission and dissemination , 2014, Nature Biotechnology.

[534]  Gary D Bader,et al.  A draft map of the human proteome , 2014, Nature.

[535]  M. Huss,et al.  HiRIEF LC-MS enables deep proteome coverage and unbiased proteogenomics , 2013, Nature Methods.

[536]  Samuel H. Payne,et al.  Proteogenomic strategies for identification of aberrant cancer peptides using large‐scale next‐generation sequencing data , 2014, Proteomics.

[537]  Wei Wu,et al.  NONCODEv4: exploring the world of long non-coding RNA genes , 2013, Nucleic Acids Res..

[538]  P. Pevzner,et al.  De novo protein sequencing by combining top-down and bottom-up tandem mass spectra. , 2014, Journal of proteome research.

[539]  Torsten Seemann,et al.  Prokka: rapid prokaryotic genome annotation , 2014, Bioinform..

[540]  D. Goodlett,et al.  Multiplexed and data-independent tandem mass spectrometry for global proteome profiling. , 2014, Mass spectrometry reviews.

[541]  James E. Johnson,et al.  Flexible and Accessible Workflows for Improved Proteogenomic Analysis Using the Galaxy Framework , 2014, Journal of proteome research.

[542]  M. Wilkins,et al.  Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing. , 2014, Journal of proteome research.

[543]  M. Bellgard,et al.  High‐throughput parallel proteogenomics: A bacterial case study , 2014, Proteomics.

[544]  Ying Jiang,et al.  Discovery of novel genes and gene isoforms by integrating transcriptomic and proteomic profiling from mouse liver. , 2014, Journal of proteome research.

[545]  J. Armengaud,et al.  Non-model organisms, a species endangered by proteogenomics. , 2014, Journal of proteomics.

[546]  P. Bourne,et al.  MixGF: Spectral Probabilities for Mixture Spectra from more than One Peptide , 2014, Molecular & Cellular Proteomics.

[547]  J. Batley,et al.  A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome , 2014, Science.

[548]  Andrey Tovchigrechko,et al.  PGP: parallel prokaryotic proteogenomics pipeline for MPI clusters, high-throughput batch clusters and multicore workstations , 2014, Bioinform..

[549]  Rick L. Stevens,et al.  High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource , 2014, Proceedings of the National Academy of Sciences.

[550]  B. Shen,et al.  A proteogenomics approach integrating proteomics and ribosome profiling increases the efficiency of protein identification and enables the discovery of alternative translation start sites , 2014, Proteomics.

[551]  J. Armengaud,et al.  N‐terminomics and proteogenomics, getting off to a good start , 2014, Proteomics.

[552]  Björn Usadel,et al.  Trimmomatic: a flexible trimmer for Illumina sequence data , 2014, Bioinform..

[553]  B. Kuster,et al.  Mass-spectrometry-based draft of the human proteome , 2014, Nature.

[554]  Alla Lapidus,et al.  ExSPAnder: a universal repeat resolver for DNA fragment assembly , 2014, Bioinform..

[555]  Gennifer E. Merrihew,et al.  Proteogenomic database construction driven from large scale RNA-seq data. , 2014, Journal of proteome research.

[556]  Shivashankar H. Nagaraj,et al.  PGTools: A Software Suite for Proteogenomic Data Analysis and Visualization. , 2015, Journal of proteome research.

[557]  Evgeny M. Zdobnov,et al.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs , 2015, Bioinform..