NuChart: An R Package to Study Gene Spatial Neighbourhoods with Multi-Omics Annotations

Long-range chromosomal associations between genomic regions, and their repositioning in the 3D space of the nucleus, are now considered key contributors to the regulation of gene expression, and important links have been highlighted with other genomic features involved in DNA rearrangements. Recent Chromosome Conformation Capture (3C) measurements performed with high-throughput sequencing (Hi-C) and molecular dynamics studies show a strong correlation between the colocalization and coregulation of genes, but this important research is hampered by the lack of biologist-friendly analysis and visualisation software. Here, we describe NuChart, an R package that allows the user to annotate and statistically analyse a list of input genes using Hi-C data, integrating knowledge about genomic features involved in chromosome spatial organization. NuChart works directly with sequenced reads to identify the related Hi-C fragments, with the aim of creating gene-centric neighbourhood graphs on which multi-omics features can be mapped. Predictions of CTCF binding sites, isochores and cryptic Recombination Signal Sequences are provided with the package for mapping, and other annotation data in BED format (such as methylation profiles and histone patterns) can also be used. Gene expression data can be automatically retrieved from the Gene Expression Omnibus and ArrayExpress repositories and processed to highlight the expression profile of genes in the identified neighbourhood. Moreover, statistical inference about the graph structure, and about correlations between its topology and multi-omics features, can be performed using Exponential-family Random Graph Models. The Hi-C fragment visualisation provided by NuChart allows the comparison of cells in different conditions, thus making it possible to identify novel biomarkers.
NuChart is compliant with the Bioconductor standard and is freely available at ftp://fileserver.itb.cnr.it/nuchart.
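
To make the idea of a gene-centric neighbourhood graph concrete, the following is a minimal sketch in Python, not NuChart's R implementation. It assumes a simplified setting in which each Hi-C paired-end read is already reduced to two genomic positions, and it links two genes whenever the two ends of a read fall inside their intervals; the real package works at the level of Hi-C fragments. All gene names, coordinates and function names here are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def overlapping_genes(genes, chrom, pos):
    """Return the names of genes whose interval contains (chrom, pos)."""
    return [
        name
        for name, (g_chrom, start, end) in genes.items()
        if g_chrom == chrom and start <= pos <= end
    ]


def build_contact_graph(genes, read_pairs):
    """Link two genes when the two ends of a read pair fall inside them."""
    graph = defaultdict(set)
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        for gene_a in overlapping_genes(genes, chrom_a, pos_a):
            for gene_b in overlapping_genes(genes, chrom_b, pos_b):
                if gene_a != gene_b:
                    graph[gene_a].add(gene_b)
                    graph[gene_b].add(gene_a)
    return graph


def neighbourhood(graph, seed, max_steps):
    """Breadth-first expansion from a seed gene, up to max_steps contacts away.

    Returns a dict mapping each reached gene to its distance from the seed.
    """
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        if distance[gene] == max_steps:
            continue  # do not expand beyond the requested radius
        for neighbour in graph.get(gene, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[gene] + 1
                queue.append(neighbour)
    return distance


# Hypothetical toy data: gene name -> (chromosome, start, end).
genes = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr1", 5000, 5200),
    "GENE_C": ("chr2", 300, 400),
    "GENE_D": ("chr3", 10, 50),
}

# Hypothetical read pairs: ((chrom, pos), (chrom, pos)).
read_pairs = [
    (("chr1", 150), ("chr1", 5100)),   # GENE_A <-> GENE_B
    (("chr1", 5050), ("chr2", 350)),   # GENE_B <-> GENE_C
    (("chr2", 390), ("chr3", 20)),     # GENE_C <-> GENE_D
    (("chr1", 9000), ("chr2", 9000)),  # falls in no annotated gene
]

graph = build_contact_graph(genes, read_pairs)
result = neighbourhood(graph, "GENE_A", max_steps=2)
print(result)
```

In this toy setting, starting from `GENE_A` with a radius of two contacts reaches `GENE_B` and `GENE_C` but not `GENE_D`. Annotations such as CTCF sites or expression values could then be attached to the nodes of the resulting graph as additional attributes.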
