HiCNN2: Enhancing the Resolution of Hi-C Data Using an Ensemble of Convolutional Neural Networks

We present HiCNN2, a deep-learning package that learns the mapping between low-resolution and high-resolution data from Hi-C, a technique for capturing genome-wide chromatin interactions, and uses that mapping to enhance the resolution of Hi-C interaction matrices. The package includes three methods, each with a different deep-learning architecture: HiCNN2-1 is based on a single convolutional neural network (ConvNet), HiCNN2-2 is an ensemble of two different ConvNets, and HiCNN2-3 is an ensemble of three different ConvNets. Our evaluation results indicate that, compared with the existing methods HiCPlus and HiCNN, HiCNN2-enhanced high-resolution Hi-C data achieve a smaller mean squared error and higher Pearson’s correlation coefficients with experimental high-resolution Hi-C data. Moreover, all three HiCNN2 methods recover more of the significant interactions detected by Fit-Hi-C than HiCPlus and HiCNN do. Based on these results, we recommend HiCNN2-1 and HiCNN2-3 when the goal is to recover more significant interactions from Hi-C data, and HiCNN2-2 and HiCNN when the goal is to achieve higher reproducibility scores between the enhanced Hi-C matrix and the real high-resolution Hi-C matrix.
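As an illustration of the two agreement measures named above, the following minimal sketch computes the mean squared error and Pearson's correlation coefficient between an enhanced contact matrix and an experimental one. The function names and the 3×3 toy matrices are hypothetical and are not taken from the HiCNN2 package; the sketch also compares all matrix entries at once, which may differ from how the evaluation in the paper groups or filters contacts.

```python
import math


def flatten(matrix):
    """Return all entries of a 2D list as one flat list (row by row)."""
    return [value for row in matrix for value in row]


def mean_squared_error(enhanced, experimental):
    """Average squared difference between corresponding matrix entries."""
    a, b = flatten(enhanced), flatten(experimental)
    if len(a) != len(b):
        raise ValueError("matrices must have the same number of entries")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pearson_correlation(enhanced, experimental):
    """Pearson's correlation coefficient over all matrix entries."""
    a, b = flatten(enhanced), flatten(experimental)
    if len(a) != len(b):
        raise ValueError("matrices must have the same number of entries")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant matrix")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy data: symmetric contact counts for three genomic bins.
experimental = [[10, 4, 1], [4, 12, 5], [1, 5, 9]]
enhanced = [[9, 5, 1], [5, 11, 4], [1, 4, 10]]

mse = mean_squared_error(enhanced, experimental)
r = pearson_correlation(enhanced, experimental)
print(f"MSE = {mse:.4f}, Pearson r = {r:.4f}")
```

A lower MSE and a higher correlation both indicate closer agreement with the experimental matrix, which is the sense in which the abstract compares methods.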
