The non-coding variant rs1800734 enhances DCLK3 expression through long-range interaction and promotes colorectal cancer progression

Genome-wide association studies have identified a large number of non-coding risk variants for colorectal cancer (CRC). To date, the majority of these variants have not been functionally studied. Identifying allele-specific transcription factor (TF) binding is important for understanding the regulatory consequences of such variants. A recently developed method, proteome-wide analysis of disease-associated SNPs (PWAS), enables identification of TF-DNA interactions in an unbiased manner. Here we perform a large-scale PWAS study to comprehensively characterize the TF-binding landscape associated with CRC, identifying 731 allele-specific TF binding events at 116 CRC risk loci. This screen shows that the A-allele of rs1800734, located within the promoter region of MLH1, perturbs the binding of TFAP4 and consequently increases DCLK3 expression through a long-range interaction. Increased DCLK3 expression in turn promotes cancer malignancy by enhancing the expression of genes related to epithelial-to-mesenchymal transition.
