Probing multi-way chromatin interaction with hypergraph representation learning

Advances in high-throughput mapping of 3D genome organization have enabled genome-wide characterization of chromatin interactions. However, proximity-ligation-based approaches for mapping pairwise chromatin interactions, such as Hi-C, cannot capture multi-way interactions, which are informative for delineating higher-order genome organization and gene regulation mechanisms at single-nucleus resolution. The recent development of ligation-free chromatin interaction mapping methods such as SPRITE and ChIA-Drop has offered new opportunities to uncover simultaneous interactions involving multiple genomic loci within the same nucleus. Unfortunately, methods for analyzing multi-way chromatin interaction data remain underexplored. Here we develop a new computational method, called MATCHA, based on hypergraph representation learning, in which multi-way chromatin interactions are represented as hyperedges. Applications to SPRITE and ChIA-Drop data suggest that MATCHA is effective at denoising the data and making de novo predictions of multi-way chromatin interactions, reducing the potential false positives and false negatives in the original data. We also show that MATCHA is able to distinguish between a multi-way interaction in a single nucleus and a combination of pairwise interactions in a cell population. In addition, the embeddings from MATCHA reflect 3D genome spatial localization and function. MATCHA provides a promising framework to improve the analysis of multi-way chromatin interaction data and has the potential to offer unique insights into higher-order chromosome organization and function.
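To illustrate why a hyperedge representation can carry information that pairwise contacts alone do not, the following minimal sketch compares two toy inputs. It is not MATCHA's implementation: the bin labels `A`, `B`, `C` and the helper `clique_expansion` are hypothetical names introduced only for this example. One input is a single three-way contact, as might be observed in one nucleus; the other is three separate pairwise contacts, as might be pooled from a cell population.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Project hyperedges onto the set of pairwise contacts they imply."""
    pairs = set()
    for edge in hyperedges:
        pairs.update(frozenset(p) for p in combinations(sorted(edge), 2))
    return pairs


# Hypothetical genomic bins labelled A, B, C (illustrative only).
# One three-way contact observed together in a single nucleus:
multiway = [frozenset({"A", "B", "C"})]
# Three independent pairwise contacts pooled across a population:
pairwise_only = [
    frozenset({"A", "B"}),
    frozenset({"B", "C"}),
    frozenset({"A", "C"}),
]

# Both inputs collapse to the same pairwise contact set ...
same_pairs = clique_expansion(multiway) == clique_expansion(pairwise_only)
# ... but they remain distinct when kept as hyperedges.
same_hyperedges = set(multiway) == set(pairwise_only)

print(same_pairs, same_hyperedges)
```

Under these assumptions, the two inputs are indistinguishable after projection to pairs but differ as sets of hyperedges, which is the distinction a pairwise contact map alone cannot express.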
