MATCHA: Probing multi-way chromatin interaction with hypergraph representation learning.

Recent advances in ligation-free, genome-wide chromatin interaction mapping methods such as SPRITE and ChIA-Drop have enabled the identification of simultaneous interactions involving multiple genomic loci within the same nucleus. These multi-way interactions help delineate higher-order genome organization and gene regulation mechanisms at single-nucleus resolution. However, computational methods for analyzing multi-way chromatin interaction data remain largely underexplored. Here we develop an algorithm, called MATCHA, based on hypergraph representation learning, in which each multi-way chromatin interaction is represented as a hyperedge. Applications to SPRITE and ChIA-Drop data suggest that MATCHA is effective at denoising the data and making de novo predictions, which greatly enhances data quality for analyzing the properties of multi-way chromatin interactions. MATCHA provides a promising framework for improving the analysis of multi-way chromatin interaction data and has the potential to offer unique insights into higher-order chromosome organization and function. MATCHA is freely available at https://github.com/ma-compbio/MATCHA.
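
To make the hyperedge representation concrete, the following minimal Python sketch encodes a few multi-way interactions as a hypergraph. It is an illustration only and not MATCHA's implementation: the toy clusters, the (chromosome, bin) node labels, and the function names build_hypergraph, incidence_matrix, and clique_expansion are all assumptions introduced here. Each cluster of co-occurring genomic bins becomes one hyperedge, and the clique expansion shows the pairwise contact counts that result when a multi-way interaction is broken into pairs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each cluster is the set of genomic bins
# (chromosome, bin index) observed together in one complex.
clusters = [
    {("chr1", 10), ("chr1", 42), ("chr1", 97)},
    {("chr1", 10), ("chr1", 42)},
    {("chr1", 42), ("chr1", 97), ("chr2", 5), ("chr2", 8)},
]


def build_hypergraph(clusters):
    """Map bins to integer node ids and clusters to hyperedges."""
    nodes = sorted({b for cluster in clusters for b in cluster})
    index = {b: i for i, b in enumerate(nodes)}
    hyperedges = [frozenset(index[b] for b in cluster) for cluster in clusters]
    return nodes, hyperedges


def incidence_matrix(n_nodes, hyperedges):
    """Return H where H[i][j] = 1 if node i belongs to hyperedge j."""
    H = [[0] * len(hyperedges) for _ in range(n_nodes)]
    for j, edge in enumerate(hyperedges):
        for i in edge:
            H[i][j] = 1
    return H


def clique_expansion(hyperedges):
    """Count pairwise contacts implied by each hyperedge.

    This discards the multi-way structure and is shown only for contrast
    with the hyperedge representation.
    """
    counts = Counter()
    for edge in hyperedges:
        for a, b in combinations(sorted(edge), 2):
            counts[(a, b)] += 1
    return counts


nodes, hyperedges = build_hypergraph(clusters)
H = incidence_matrix(len(nodes), hyperedges)
pairwise = clique_expansion(hyperedges)
```

In this sketch the incidence matrix H keeps every multi-way interaction intact as a column, whereas the pairwise counts cannot distinguish one three-way contact from three separate two-way contacts.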
