Proteogenomics for personalised molecular profiling

This work was supported by NIH grant (U41HG007234) to the GENCODE project and Wellcome Trust grant (WT098051) to the Sanger Institute.
