Single-cell RNA-seq variant analysis for exploration of genetic heterogeneity in cancer

Inter- and intra-tumour heterogeneity arises from both genetic and non-genetic factors and has severe clinical implications. High-throughput sequencing technologies provide unprecedented tools to analyse DNA and RNA in single cells and to explore both genetic heterogeneity and phenotypic variation between cells in tissues and tumours. Simultaneous analysis of DNA and RNA in the same cell is, however, still in its infancy. We have therefore developed a method to extract and analyse information on genetic heterogeneity that affects cellular biology from single-cell RNA-seq (scRNA-seq) data. The method enables both comparison and clustering of cells based on single nucleotide variants, revealing cellular subpopulations that are corroborated by gene expression-based methods. Furthermore, the results show that lymph node metastases have lower levels of genetic heterogeneity than their original tumours with respect to variants affecting protein function. The analysis also revealed three previously unknown variants common across cancer cells in glioblastoma patients. These results demonstrate the power and versatility of scRNA-seq variant analysis and highlight it as a useful complement to existing methods, enabling simultaneous investigation of both gene expression and genetic variation.
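
The abstract does not spell out how cells are compared, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes each cell is represented as a mapping from a variant site to a called genotype, scores a pair of cells by the fraction of sites called in both that carry the same genotype, and groups cells whose score reaches a threshold. The function names, the `min_overlap` and `threshold` parameters, and the toy genotypes are all hypothetical.

```python
from itertools import combinations


def genotype_similarity(a, b, min_overlap=2):
    """Fraction of sites called in both cells that share a genotype.

    Returns None when fewer than `min_overlap` sites are called in both
    cells, since sparse scRNA-seq coverage makes such pairs uninformative.
    (Both the metric and the cutoff are assumptions for illustration.)
    """
    shared = a.keys() & b.keys()
    if len(shared) < min_overlap:
        return None
    matches = sum(1 for site in shared if a[site] == b[site])
    return matches / len(shared)


def cluster_cells(cells, threshold=0.9, min_overlap=2):
    """Single-linkage grouping of cells via union-find.

    Two cells end up in the same group if they are connected by a chain
    of pairs whose similarity is at least `threshold`.
    """
    parent = {name: name for name in cells}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(cells, 2):
        sim = genotype_similarity(cells[u], cells[v], min_overlap)
        if sim is not None and sim >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in cells:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Toy data (hypothetical): site -> genotype for four cells.
cells = {
    "cell_A": {("chr1", 100): "A/G", ("chr2", 200): "C/C", ("chr3", 300): "T/T"},
    "cell_B": {("chr1", 100): "A/G", ("chr2", 200): "C/C"},
    "cell_C": {("chr1", 100): "A/A", ("chr2", 200): "C/T", ("chr3", 300): "T/T"},
    "cell_D": {("chr1", 100): "A/A", ("chr2", 200): "C/T"},
}

clusters = cluster_cells(cells)
```

With the toy data above, cells A and B share all overlapping genotypes, as do C and D, so the sketch yields two groups. Restricting the comparison to sites called in both cells matters because expression-dependent coverage means most variants are not observable in every cell.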
