Accurate loop calling for 3D genomic data with cLoops

Sequencing-based 3D genome mapping technologies can identify loops formed by interactions between regulatory elements hundreds of kilobases apart. Most existing loop-calling tools are restricted to a single data type, depend for their accuracy on a contact matrix of pre-defined resolution or on previously called peaks, and can have prohibitive hardware requirements. Here we introduce cLoops (‘see loops’) to address these limitations. cLoops is based on the clustering algorithm cDBSCAN, which directly analyzes the paired-end tags (PETs) to find candidate loops, and it uses a permuted local background to estimate statistical significance. Because neither step depends on the data type, loops can be reliably identified in both sharp-peak and broad-peak data, including but not limited to ChIA-PET, Hi-C, HiChIP and Trac-looping data. Loops identified by cLoops showed much less distance-dependent bias and higher enrichment relative to local regions than loops called by existing tools. Altogether, cLoops improves the accuracy of detecting 3D genomic loops from sequencing data; it is versatile, flexible and efficient, has modest hardware requirements, and is freely available at https://github.com/YaqiangCao/cLoops.
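
To make the two steps concrete, the following is a minimal sketch and not the cLoops implementation. It treats each intra-chromosomal PET as a 2D point (left coordinate, right coordinate), clusters the points with plain DBSCAN as a stand-in for cDBSCAN, and then compares each candidate loop with PET counts in nearby windows of the same size. The function names, the Euclidean distance, the values of `eps` and `min_pts`, the toy coordinates, and the tiling of shifted windows used as a "local background" are all assumptions made for illustration; cLoops' actual clustering variant and permutation test may differ.

```python
from collections import deque

def dbscan(points, eps, min_pts):
    """Plain DBSCAN on 2D points; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points)
                if (x - xi) ** 2 + (y - yi) ** 2 <= eps ** 2]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nb)
    return labels

def candidate_loops(points, labels):
    """Summarize each cluster as two anchors (coordinate ranges) and a PET count."""
    loops = {}
    for (x, y), lab in zip(points, labels):
        if lab == -1:
            continue
        loops.setdefault(lab, []).append((x, y))
    return [{"x": (min(p[0] for p in m), max(p[0] for p in m)),
             "y": (min(p[1] for p in m), max(p[1] for p in m)),
             "pets": len(m)} for m in loops.values()]

def count_pets(points, xa, ya):
    """Count PETs whose two ends fall inside the anchor ranges xa and ya."""
    return sum(1 for x, y in points
               if xa[0] <= x <= xa[1] and ya[0] <= y <= ya[1])

def local_background(points, loop, max_shift=3):
    """PET counts in same-sized windows shifted around the loop (hypothetical scheme)."""
    xa, ya = loop["x"], loop["y"]
    wx, wy = xa[1] - xa[0] + 1, ya[1] - ya[0] + 1
    counts = []
    for sx in range(-max_shift, max_shift + 1):
        for sy in range(-max_shift, max_shift + 1):
            if sx == 0 and sy == 0:
                continue  # skip the loop's own window
            counts.append(count_pets(points,
                                     (xa[0] + sx * wx, xa[1] + sx * wx),
                                     (ya[0] + sy * wy, ya[1] + sy * wy)))
    return counts

# Toy data: eight PETs supporting one interaction, plus four scattered PETs.
pets = [(1000 + 10 * k, 50000 + 7 * k) for k in range(8)]
pets += [(1120, 50010), (5000, 90000), (20000, 70000), (300, 120000)]

labels = dbscan(pets, eps=50, min_pts=3)
loops = candidate_loops(pets, labels)
background = local_background(pets, loops[0])
# Fraction of nearby windows with at least as many PETs as the candidate loop.
empirical_p = sum(c >= loops[0]["pets"] for c in background) / len(background)
```

Working on the PETs themselves, rather than on a binned contact matrix, is what frees the clustering step from a fixed resolution; the anchor widths come out of the data. The brute-force neighbor search here is quadratic in the number of PETs and is only suitable for illustration.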
