Genomic, Proteomic and Phenotypic Heterogeneity in HeLa Cells across Laboratories: Implications for Reproducibility of Research Results

The independent reproduction of research results is a cornerstone of experimental research, yet it is beset by numerous challenges, including the quality and veracity of reagents and materials. Much of life science research depends on living materials, including human tissue culture cells. In this study we aimed to determine the degree of variability in the molecular makeup of commonly used human tissue culture cells and the ensuing phenotypic consequences. We collected 14 stock HeLa aliquots from 13 different laboratories across the globe, cultured them under uniform conditions, and profiled genome-wide copy numbers and mRNAs by genomic techniques, and proteins and protein turnover rates by SWATH mass spectrometry. We also phenotyped each cell line with respect to the ability of transfected let-7 mimics to modulate Salmonella infection. We discovered significant heterogeneity between HeLa variants, especially between lines of the CCL2 and Kyoto varieties. We also observed progressive divergence within a specific cell line over 50 successive passages. From the aggregate multi-omic datasets we quantified the response of the cells to genomic variability across the transcriptome and proteome. We discovered organelle-specific proteome remodeling and buffering of protein abundance by protein complex stoichiometry, mediated by the adaptation of protein turnover rates. By associating quantitative proteotype and phenotype measurements, we identified protein patterns that explained the varying response of the different cell lines to Salmonella infection. Altogether, the results indicate a striking degree of genomic variability, the rapid evolution of genomic variability in culture, and its complex translation into distinctive expressed molecular and phenotypic patterns.
The results have broad implications for the interpretation and reproducibility of research results obtained from HeLa cells and provide an important basis for a general discussion of the value of, and requirements for, communicating research results obtained from human tissue culture cells.
