A supervised learning framework for chromatin loop detection in genome-wide contact maps

Accurately predicting chromatin loops from genome-wide interaction matrices such as Hi-C data is critical to deepening our understanding of proper gene regulation. Current approaches mainly search for statistically enriched dots on a genome-wide map. However, given the availability of orthogonal data types such as ChIA-PET, HiChIP, Capture Hi-C, and high-throughput imaging, a supervised learning approach could facilitate the discovery of a comprehensive set of chromatin interactions. Here, we present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps. We compare Peakachu with current enrichment-based approaches and find that it identifies a unique set of short-range interactions. We show that our models perform well across different platforms, sequencing depths, and species. We apply this framework to predict chromatin loops in 56 Hi-C datasets and release the results at the 3D Genome Browser.
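
The supervised idea described above can be illustrated with a minimal sketch. It is not Peakachu's implementation: the abstract does not specify the features or training procedure, so everything here is an assumption made for illustration. The contact map is synthetic, the "known loops" stand in for labels derived from an orthogonal assay, and the feature vector is simply a flattened 5x5 window around each candidate pixel. The names `make_toy_map`, `window_features`, `sample_pixels`, and `HALF_WIDTH` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

HALF_WIDTH = 2  # hypothetical window half-width, giving 5x5 windows


def make_toy_map(n_bins, loops, rng):
    """Synthetic symmetric contact map: distance decay plus noise,
    with an added signal at each (i, j) in `loops`."""
    idx = np.arange(n_bins)
    dist = np.abs(idx[:, None] - idx[None, :])
    noise = rng.random((n_bins, n_bins))
    contact = 100.0 / (1.0 + dist) + (noise + noise.T) / 2.0
    for i, j in loops:
        contact[i, j] += 30.0
        contact[j, i] += 30.0
    return contact


def window_features(contact, i, j, half_width=HALF_WIDTH):
    """Flatten the square window centred on pixel (i, j) into a feature vector."""
    w = half_width
    return contact[i - w:i + w + 1, j - w:j + w + 1].ravel()


def sample_pixels(n_pixels, n_bins, rng, min_dist=10):
    """Draw distinct upper-triangle pixels with j - i >= min_dist, away from the edges."""
    pixels, seen = [], set()
    while len(pixels) < n_pixels:
        i = int(rng.integers(HALF_WIDTH, n_bins - min_dist - HALF_WIDTH))
        j = int(rng.integers(i + min_dist, n_bins - HALF_WIDTH))
        if (i, j) not in seen:
            seen.add((i, j))
            pixels.append((i, j))
    return pixels


rng = np.random.default_rng(0)
n_bins = 200
pixels = sample_pixels(400, n_bins, rng)
loops, background = pixels[:200], pixels[200:]  # toy positives and negatives
contact = make_toy_map(n_bins, loops, rng)

X_loop = np.array([window_features(contact, i, j) for i, j in loops])
X_bg = np.array([window_features(contact, i, j) for i, j in background])

# Train on the first 150 pixels of each class, hold out the remaining 50.
X_train = np.vstack([X_loop[:150], X_bg[:150]])
y_train = np.array([1] * 150 + [0] * 150)
X_test = np.vstack([X_loop[150:], X_bg[150:]])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Column 1 of predict_proba is the estimated probability of the "loop" class.
scores = model.predict_proba(X_test)[:, 1]
loop_scores, bg_scores = scores[:50], scores[50:]
print("mean score, held-out loops:     ", loop_scores.mean())
print("mean score, held-out background:", bg_scores.mean())
```

On real data, the positive labels would come from an orthogonal experiment rather than being planted in the map, and the choice of negatives, window size, and normalization would all need care that this toy example does not attempt.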
