A multi-task graph-clustering approach for chromosome conformation capture data sets identifies conserved modules of chromosomal interactions

Chromosome conformation capture methods are increasingly used to study three-dimensional genome architecture in multiple cell types and species. An important challenge is to examine how this architecture changes across cell types and species. We present Arboretum-Hi-C, a multi-task spectral clustering method that identifies common and context-specific aspects of genome architecture. Compared to standard clustering, Arboretum-Hi-C produced more biologically consistent patterns of conservation. Most clusters are conserved and enriched for either high- or low-activity genomic signals. When genomic regions diverge, most switch between clusters with similar chromatin state; the few exceptions are associated with lamina-associated domains and open chromatin.
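To make the clustering step concrete, the following is a minimal sketch of ordinary single-task spectral clustering applied to one toy contact matrix. It is not the Arboretum-Hi-C implementation: the multi-task coupling across cell types and species is not shown, and the function names (`spectral_embedding`, `kmeans`, `spectral_clusters`), the toy contact counts, and the choice of a symmetrically normalized affinity matrix with a deterministic k-means are all assumptions made for illustration.

```python
import numpy as np


def spectral_embedding(contacts, k):
    """Embed regions using the k leading eigenvectors of D^-1/2 W D^-1/2."""
    w = np.asarray(contacts, dtype=float)
    w = (w + w.T) / 2.0  # enforce symmetry of the contact matrix
    degree = w.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    normalized = w * inv_sqrt[:, None] * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so take the last k columns.
    _, eigvecs = np.linalg.eigh(normalized)
    top = eigvecs[:, -k:]
    # Scale each row to unit length before clustering.
    row_norms = np.linalg.norm(top, axis=1, keepdims=True)
    return top / np.maximum(row_norms, 1e-12)


def kmeans(points, k, n_iter=50):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min(
            [np.sum((points - c) ** 2, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clusters(contacts, k):
    """Cluster genomic regions from a symmetric contact-count matrix."""
    return kmeans(spectral_embedding(contacts, k), k)


# Hypothetical toy data: six regions forming two groups, with strong
# within-group contacts (10) and weak between-group contacts (1).
toy = np.ones((6, 6))
toy[:3, :3] = 10.0
toy[3:, 3:] = 10.0
np.fill_diagonal(toy, 0.0)

labels = spectral_clusters(toy, k=2)
```

On this toy matrix the first three regions should share one label and the last three another. A multi-task variant would additionally encourage matched clusters to agree across related data sets, which this sketch deliberately leaves out.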
