Discovery and characterization of variance QTLs in human induced pluripotent stem cells

Quantification of gene expression at the single-cell level has revealed that expression can vary substantially even across a population of homogeneous cells. However, it is currently unclear which genomic features control variation in gene expression levels, and whether common genetic variants affect gene expression variation. Here, we take a genome-wide approach to identify expression variance quantitative trait loci (vQTLs). To this end, we generated single-cell RNA-seq (scRNA-seq) data from induced pluripotent stem cells (iPSCs) derived from 53 Yoruba individuals. We collected data for a median of 95 cells per individual and a total of 5,447 single cells, and identified 241 mean expression QTLs (eQTLs) at 10% FDR, of which 82% replicate in bulk RNA-seq data from the same individuals. We further identified 14 vQTLs at 10% FDR, but demonstrate that these can also be explained as effects on mean expression. Our study suggests that dispersion QTLs (dQTLs), which could alter the variance of expression independently of the mean, can have larger fold changes but explain less phenotypic variance than eQTLs. We estimate that at least 424 individuals are needed to achieve 80% power to detect the strongest dQTLs in iPSCs. These results will guide the design of future studies of the genetic control of gene expression variance.

Author summary

Common genetic variation can alter the average level of gene expression in human tissues and, through changes in gene expression, have downstream consequences for cell function, human development, and human disease. However, human tissues are composed of many cells, each with its own level of gene expression. With advances in single-cell sequencing technologies, we can now go beyond measuring the average level of gene expression in a tissue sample and directly measure cell-to-cell variance in gene expression.
We hypothesized that genetic variation could also alter gene expression variance, potentially revealing new insights into human development and disease. To test this hypothesis, we used single-cell RNA sequencing to directly measure gene expression variance in multiple individuals, and then tested for association between gene expression variance and genetic variation in those same individuals. Our results suggest that effects on gene expression variance are smaller than effects on mean expression, relative to how much each phenotype varies between individuals, and that detecting them will require much larger studies than previously thought.
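
The distinction between a variance effect that follows from a mean effect and a true dispersion effect can be illustrated with a small simulation. The sketch below is not the study's analysis; it assumes, purely for illustration, that per-cell counts follow a negative binomial (gamma-Poisson) distribution with variance mean + dispersion × mean², and all parameter values and function names are hypothetical.

```python
import numpy as np


def nb_variance(mean, dispersion):
    """Variance under the assumed model: mean + dispersion * mean^2."""
    return mean + dispersion * mean ** 2


def simulate_counts(rng, mean, dispersion, n_cells):
    """Draw counts from a gamma-Poisson (negative binomial) mixture."""
    # A gamma with shape 1/dispersion and scale mean*dispersion has the
    # requested mean and variance dispersion * mean^2.
    rates = rng.gamma(shape=1.0 / dispersion, scale=mean * dispersion, size=n_cells)
    return rng.poisson(rates)


def estimate_dispersion(counts):
    """Method-of-moments dispersion estimate: (variance - mean) / mean^2."""
    m = counts.mean()
    return (counts.var(ddof=1) - m) / m ** 2


rng = np.random.default_rng(0)
n_cells = 20000  # hypothetical; far more cells than a single individual in the study

# Hypothetical genotype groups for one gene.
ref = simulate_counts(rng, mean=2.0, dispersion=0.5, n_cells=n_cells)
mean_shift = simulate_counts(rng, mean=4.0, dispersion=0.5, n_cells=n_cells)  # eQTL-like
disp_shift = simulate_counts(rng, mean=2.0, dispersion=1.0, n_cells=n_cells)  # dQTL-like

for name, counts in [("reference", ref), ("mean shift", mean_shift), ("dispersion shift", disp_shift)]:
    print(name, counts.mean(), counts.var(ddof=1), estimate_dispersion(counts))
```

Under these assumptions, the mean-shift group has a larger variance than the reference even though its dispersion is unchanged, so a test on variance alone would flag it. Only the dispersion-shift group changes variance while leaving the mean fixed, which is the kind of effect a dQTL would produce.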
