BHi-Cect: a top-down algorithm for identifying the multi-scale hierarchical structure of chromosomes

Abstract: High-throughput chromosome conformation capture (Hi-C) technology enables the investigation of genome-wide interactions among chromosome loci. Current algorithms focus on topologically associating domains (TADs), which are contiguous clusters along the genome coordinate, to describe the hierarchical structure of chromosomes. However, high-resolution Hi-C data display a variety of interaction patterns beyond what current TAD detection methods can capture. Here, we present BHi-Cect, a novel top-down algorithm that uses spectral clustering to find clusters by considering every locus, with no assumption of genomic contiguity. Our results reveal that the hierarchical structure of chromosomes is organized as ‘enclaves’, which are complex interwoven clusters at both local and global scales. We show that the nesting of local clusters within global clusters, which characterizes enclaves, is associated with the epigenomic activity found on the underlying DNA. Furthermore, we show that the hierarchical nesting that links different enclaves integrates their respective functions. BHi-Cect provides a means to uncover the general principles guiding chromatin architecture.
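The top-down idea in the abstract can be illustrated with a minimal sketch of recursive spectral bipartitioning. This is not the BHi-Cect implementation: the abstract does not specify the Laplacian, the stopping rule, or any preprocessing, so the symmetric normalized Laplacian, the sign-based split, the `min_size` parameter, and the function names `fiedler_split` and `top_down_clusters` are all assumptions made for illustration. The toy contact matrix is synthetic and deliberately interleaves two groups of loci so that neither cluster is contiguous along the coordinate.

```python
import numpy as np

def fiedler_split(contacts):
    """Split loci into two groups by the sign of the second-smallest
    eigenvector of the symmetric normalized Laplacian (assumed choice)."""
    degree = contacts.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))  # guard empty rows
    # L_sym = I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(len(contacts)) - inv_sqrt[:, None] * contacts * inv_sqrt[None, :]
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues returned in ascending order
    return vectors[:, 1] >= 0

def top_down_clusters(contacts, loci=None, min_size=2, label="root"):
    """Recursively bipartition loci; no genomic contiguity is imposed.
    min_size is a hypothetical stopping parameter."""
    if loci is None:
        loci = np.arange(len(contacts))
    node = {"label": label, "loci": loci.tolist(), "children": []}
    if len(loci) < 2 * min_size:
        return node  # too small to split further
    mask = fiedler_split(contacts[np.ix_(loci, loci)])
    if mask.all() or not mask.any():
        return node  # degenerate split, stop here
    for name, part in (("a", loci[mask]), ("b", loci[~mask])):
        node["children"].append(
            top_down_clusters(contacts, part, min_size, label + "." + name)
        )
    return node

# Synthetic toy matrix: even-indexed loci interact strongly with each other,
# as do odd-indexed loci, so the two clusters are interleaved along the genome.
n = 8
idx = np.arange(n)
toy = np.where((idx[:, None] % 2) == (idx[None, :] % 2), 1.0, 0.1)

tree = top_down_clusters(toy, min_size=4)
for child in tree["children"]:
    print(child["label"], child["loci"])
```

In this toy case the first split groups the even-indexed loci apart from the odd-indexed ones, which a method restricted to contiguous domains could not express. Which group receives the `a` or `b` label depends on the arbitrary sign of the eigenvector.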
