Retrieving high-resolution chromatin interactions and decoding enhancer regulatory potential in silico

The advent of chromosome conformation capture (3C) and related technologies has profoundly renewed our understanding of three-dimensional chromatin organization in mammalian nuclei. Alongside these experimental approaches, numerous computational tools for handling, normalizing, visualizing, and ultimately detecting interactions in 3C-type datasets are being developed. Here, we present Bloom, a comprehensive method for the analysis of 3C-type data matrices, based on Dirichlet process mixture models, that addresses two important open issues. First, it retrieves otherwise hidden interaction patterns from sparse data, such as those derived from single-cell Hi-C experiments; "bloomed" sparse data can thus be used to study interaction landscapes at sub-kbp resolution. Second, it detects enhancer-promoter interactions with high sensitivity and inherently assigns an interaction frequency score (IFS) to each contact. Using enhancer perturbation datasets of varying throughput, we show that the IFS accurately quantifies the regulatory influence of each enhancer on its target promoter. As a result, Bloom allows complex regulatory landscapes to be decoded by generating functionally relevant enhancer atlases solely on the basis of 3C-type data.
