A scalable strategy for high-throughput GFP tagging of endogenous human proteins

Significance: The function of a large fraction of the human proteome remains poorly characterized. Tagging proteins with a functional sequence is a powerful way to access function, and inserting tags at endogenous genomic loci preserves a near-native cellular background. To characterize the cellular role of human proteins systematically and in a native context, we developed a method for tagging endogenous human proteins with GFP that is both rapid and readily applicable at a genome-wide scale. Our approach allows the study of both the localization and the interaction partners of the target protein. Our results pave the way for the large-scale generation of endogenously tagged human cell lines for a systematic functional interrogation of the human proteome.

A central challenge of the postgenomic era is to comprehensively characterize the cellular role of the ∼20,000 proteins encoded in the human genome. To systematically study protein function in a native cellular background, libraries of human cell lines expressing proteins tagged with a functional sequence at their endogenous loci would be very valuable. Here, using electroporation of Cas9 nuclease/single-guide RNA ribonucleoproteins and taking advantage of a split-GFP system, we describe a scalable method for the robust, scarless, and specific tagging of endogenous human genes with GFP. Our approach requires no molecular cloning and allows a large number of cell lines to be processed in parallel. We demonstrate the scalability of our method by targeting 48 human genes and show that the resulting GFP fluorescence correlates with protein expression levels. We next show how our protocols can be easily adapted to tag a given target with GFP repeats, critically enabling the study of low-abundance proteins. Finally, we show that our GFP tagging approach allows the biochemical isolation of native protein complexes for proteomic studies. Taken together, our results pave the way for the large-scale generation of endogenously tagged human cell lines for the proteome-wide analysis of protein localization and interaction networks in a native cellular context.
