Challenges and progress in interpretation of non-coding genetic variants associated with human disease

Genome-wide association studies have shown that the vast majority of disease-associated variants reside in non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. Identifying the truly causal non-coding variants and their affected target genes remains challenging, but it is a critical step in translating genetic associations into molecular mechanisms and, ultimately, clinical applications. Here we review genomic and epigenomic resources and in silico tools that can be used to identify causal non-coding variants, as well as experimental strategies to validate their function.

Impact statement

Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remains challenging. We review recent progress in methodologies for studying the non-coding genome and argue that no single approach can effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review may serve as a guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.
