Functional interpretation of non-coding sequence variation: Concepts and challenges

Understanding the functional mechanisms underlying genetic signals associated with complex traits and common diseases, such as cancer, diabetes and Alzheimer's disease, is a formidable challenge. Many genetic signals discovered through genome‐wide association studies map to non‐protein coding sequences, where their molecular consequences are difficult to evaluate. This article summarizes concepts for the systematic interpretation of non‐coding genetic signals using genome annotation data sets in different cellular systems. We outline strategies for the global analysis of multiple association intervals and the in‐depth molecular investigation of individual intervals. We highlight experimental techniques to validate candidate (potentially causal) regulatory variants, with a focus on novel genome‐editing techniques including CRISPR/Cas9. These approaches are also applicable to low‐frequency and rare variants, which have become increasingly important in genomic studies of complex traits and diseases. There is a pressing need to translate genetic signals into biological mechanisms, leading to prognostic, diagnostic and therapeutic advances.
