Multidimensional separation schemes enhance the identification and molecular characterization of low molecular weight proteomes and short open reading frame-encoded peptides in top-down proteomics.

Short open reading frame-encoded peptides (SEP) represent a largely undiscovered part of the proteome. SEP have so far been analyzed predominantly with bottom-up proteomic workflows, despite the inherent limitations of this approach: incomplete sequence coverage and challenges in protein inference, in the identification of posttranslational modifications, and in the assignment of potential N- and C-terminal truncations. Top-down proteomic workflows can provide an unparalleled level of molecular characterization, which is of particular importance for alternatively encoded protein products. However, top-down analysis has its own limitations, among which efficient separation prior to MS analysis is a major issue. We established a sample preparation approach for the combined bottom-up and top-down proteomic analysis of SEP. Key improvements were achieved by applying solid phase extraction (SPE), which supported enrichment of proteins below ca. 20 kDa, followed by 2D-LC-MS top-down analysis encompassing both HCD and EThcD ion activation. Bottom-up experiments were used to support and confirm the interpretation of the top-down data. This strategy allowed the top-down characterization of 36 proteoforms mapping to 12 SEP from the archaeon Methanosarcina mazei strain Gö1, with the concurrent detection and identification of several posttranslational modifications in SEP.

BIOLOGICAL SIGNIFICANCE: Small or short open reading frames (sORF) have been widely neglected in genome research in the past. With their increasing discovery, the question arises whether their translation products, the short open reading frame-encoded peptides (SEP), are present and what molecular function they have. As these small proteins are usually below 10 kDa, the number of peptides identifiable by bottom-up proteomics is limited, which hampers both the identification of SEP and the recognition of potential posttranslational modifications. The presented top-down approach allowed the detection of full-length SEP as well as of terminally truncated proteoforms, and further enabled the identification of disulfide bonds in these small proteins. This demonstrates that this still largely undiscovered part of the proteome undergoes the same modifications as classical proteins, a finding that is an essential step towards understanding the biological functions of these molecules.
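The statement that small proteins yield only a limited number of peptides in bottom-up experiments can be illustrated with a minimal in-silico digestion sketch. It is not part of the reported workflow. It assumes a simplified trypsin rule (cleavage after K or R unless followed by P, no missed cleavages) and an illustrative detectable length window of 7 to 30 residues; the function names, the window, and the example sequence are hypothetical.

```python
def tryptic_peptides(sequence):
    """Split a protein sequence with a simplified trypsin rule:
    cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal remainder that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def detectable(peptides, min_len=7, max_len=30):
    """Keep peptides inside an assumed LC-MS-friendly length window."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


if __name__ == "__main__":
    # Hypothetical 40-residue sequence, NOT a real M. mazei SEP.
    toy_sep = "MSKEELLKRAGCPTVDWLCNGGHEIRFYAAPLLTKDEGSW"
    all_peptides = tryptic_peptides(toy_sep)
    usable = detectable(all_peptides)
    print(f"{len(all_peptides)} tryptic peptides, "
          f"{len(usable)} within the assumed length window")
    for peptide in usable:
        print(peptide)
```

For a short sequence, only a few peptides fall inside such a window, so a single missed or modified peptide can remove a large share of the observable sequence. Top-down analysis of the intact protein avoids this constraint.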
