Fine-mapping cellular QTLs with RASQUAL and ATAC-seq

When cellular traits are measured using high-throughput DNA sequencing, quantitative trait loci (QTLs) manifest as fragment count differences between individuals and allelic differences within individuals. We present RASQUAL (Robust Allele-Specific Quantitation and Quality Control), a new statistical approach for association mapping that models genetic effects and accounts for biases in sequencing data using a single, probabilistic framework. RASQUAL substantially improves fine-mapping accuracy and sensitivity relative to existing methods in RNA-seq, DNase-seq and ChIP-seq data. We illustrate how RASQUAL can be used to maximize association detection by generating the first map of chromatin accessibility QTLs (caQTLs) in a European population using ATAC-seq. Despite a modest sample size, we identified 2,707 independent caQTLs (at a false discovery rate of 10%) and demonstrated how RASQUAL and ATAC-seq can provide powerful information for fine-mapping gene-regulatory variants and for linking distal regulatory elements with gene promoters. Our results highlight how combining between-individual and allele-specific genetic signals improves the functional interpretation of noncoding variation.
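
As an illustration only, the idea of combining between-individual and allele-specific signals can be sketched as a joint likelihood with a single shared allelic-ratio parameter. The sketch below is a deliberately simplified toy and not the RASQUAL model: it assumes Poisson total counts and binomial allele-specific counts with no overdispersion, known genotypes, and no correction for reference bias or sequencing error. The function names, the data layout, and the mean parametrization (2(1 - pi), 1 and 2 pi times a baseline for genotypes 0, 1 and 2) are assumptions made for this example.

```python
import math

def poisson_logpmf(k, mu):
    # Log-probability of observing k fragments given a Poisson mean mu.
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def binom_logpmf(k, n, p):
    # Log-probability of k alternative-allele reads out of n allelic reads.
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def joint_loglik(pi, samples):
    """Joint log-likelihood for allelic ratio pi (toy model, see assumptions).

    samples: list of (genotype, total_count, ref_reads, alt_reads), where
    genotype is the number of alternative alleles (0, 1 or 2).
    """
    # Assumed mean scaling per genotype; pi = 0.5 means no genetic effect.
    scale = {0: 2 * (1 - pi), 1: 1.0, 2: 2 * pi}
    # Closed-form Poisson estimate of the baseline mean for a fixed pi
    # (requires at least one non-zero total count).
    lam = sum(s[1] for s in samples) / sum(scale[s[0]] for s in samples)
    ll = 0.0
    for genotype, total, ref, alt in samples:
        # Between-individual signal: total fragment count depends on genotype.
        ll += poisson_logpmf(total, lam * scale[genotype])
        # Within-individual signal: allelic imbalance in heterozygotes only.
        if genotype == 1 and ref + alt > 0:
            ll += binom_logpmf(alt, ref + alt, pi)
    return ll

def fit_allelic_ratio(samples):
    # Grid search over pi in 0.01..0.99; the grid contains the null value 0.5,
    # so the likelihood-ratio statistic is never negative.
    grid = [i / 100 for i in range(1, 100)]
    best_pi = max(grid, key=lambda p: joint_loglik(p, samples))
    stat = 2 * (joint_loglik(best_pi, samples) - joint_loglik(0.5, samples))
    return best_pi, stat

# Hypothetical data for one feature and one candidate SNP (not from the paper).
samples = [
    (0, 40, 0, 0),
    (0, 35, 0, 0),
    (1, 60, 8, 16),
    (1, 55, 6, 14),
    (2, 80, 0, 0),
]
best_pi, stat = fit_allelic_ratio(samples)
print(f"estimated allelic ratio: {best_pi:.2f}, LR statistic: {stat:.1f}")
```

Because both components share the parameter pi, heterozygous individuals contribute evidence through their allelic read counts as well as through their total counts, which is the intuition behind the gain in power from joint modelling. A realistic analysis would need to account for the overdispersion and technical biases that this toy ignores.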
