DeepTACT: predicting 3D chromatin contacts via bootstrapping deep learning

Abstract: Interactions between regulatory elements are crucial for understanding transcriptional regulation and interpreting disease mechanisms. The Hi-C technique was developed for genome-wide detection of chromatin contacts. However, unless extremely deep sequencing is performed on a very large number of input cells, which is technically limited and expensive, current Hi-C experiments lack the resolution needed to resolve contacts between regulatory elements. Here, we develop DeepTACT, a bootstrapping deep learning model that integrates genome sequences and chromatin accessibility data to predict chromatin contacts between regulatory elements. DeepTACT can infer not only promoter–enhancer interactions but also promoter–promoter interactions. In tests based on promoter capture Hi-C data, DeepTACT outperforms existing methods. DeepTACT analysis also identifies a class of hub promoters, which are correlated with transcriptional activation across cell lines, enriched in housekeeping genes, functionally related to fundamental biological processes, and capable of reflecting cell similarity. Finally, the utility of chromatin contacts in the study of human diseases is illustrated by the association of IFNA2 with coronary artery disease, identified through an integrative analysis of GWAS data and interactions predicted by DeepTACT.
