Systematic identification of regulatory variants associated with cancer risk

Background: Most cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding, and assessing their functional impact is challenging. To systematically identify the SNPs that affect gene expression by modulating the activities of distal regulatory elements, we adapt the self-transcribing active regulatory region sequencing (STARR-seq) strategy, a high-throughput technique for functionally quantifying enhancer activities.

Results: From 10,673 SNPs linked with 996 cancer risk-associated SNPs identified in previous GWAS, we identify 575 SNPs in fragments that positively regulate gene expression and 758 SNPs in fragments with negative regulatory activities. Among them, 70 are regulatory variants, for which the two alleles confer different regulatory activities. We analyze in depth two regulatory variants, the breast cancer risk SNP rs11055880 and the leukemia risk-associated SNP rs12142375, and demonstrate their endogenous regulatory activities on expression of the ATF7IP and PDE4B genes, respectively, using a CRISPR-Cas9 approach.

Conclusions: By identifying regulatory variants associated with cancer susceptibility and studying their molecular functions, we hope to aid the interpretation of GWAS results and provide improved information for cancer risk assessment.
