Systematic Evaluation of Statistical Methods for Identifying Looping Interactions in 5C Data.

Chromosome-Conformation-Capture-Carbon-Copy (5C) is a molecular technology based on proximity ligation that enables high-resolution and high-coverage interrogation of long-range looping interactions. Computational pipelines for analyzing 5C data involve a series of interdependent normalization procedures and statistical methods that markedly influence downstream biological results. A detailed analysis of the trade-offs inherent to all stages of 5C data analysis has not been reported. Here, we provide a comparative assessment of method performance at each step in the 5C analysis pipeline, including sequencing depth and library complexity correction, bias mitigation, spatial noise reduction, estimation of the distance-dependent expected signal and its variance, statistical modeling, and loop detection. We discuss methodological advantages and disadvantages at each step and provide a full suite of algorithms, lib5C, to allow investigators to test the range of approaches on their own 5C data. Principles learned from our comparative analyses can be applied to protein-independent proximity ligation-based data, including Hi-C, 4C, and Capture-C.
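To make the distance-dependent expected step concrete, the following is a minimal sketch in Python with NumPy. It is not the lib5C API: the function names `distance_expected` and `observed_over_expected`, the pseudocount, and the use of a simple empirical mean per diagonal are all assumptions chosen for illustration. It takes a symmetric, binned contact matrix, estimates the expected count at each genomic separation as the mean of the corresponding diagonal, and then forms an observed-over-expected ratio.

```python
import numpy as np


def distance_expected(counts):
    """Hypothetical helper: expected count at each bin separation |i - j|,
    estimated as the empirical mean of the corresponding matrix diagonal."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    expected = np.zeros_like(counts)
    for d in range(n):
        # Mean of all entries whose bins are d bins apart.
        mean_d = np.diagonal(counts, offset=d).mean()
        idx = np.arange(n - d)
        expected[idx, idx + d] = mean_d  # upper triangle
        expected[idx + d, idx] = mean_d  # mirror to keep the matrix symmetric
    return expected


def observed_over_expected(counts, pseudocount=1.0):
    """Hypothetical helper: ratio of observed to distance-expected counts.
    The pseudocount (an assumption) avoids division by zero at empty distances."""
    counts = np.asarray(counts, dtype=float)
    expected = distance_expected(counts)
    return (counts + pseudocount) / (expected + pseudocount)


if __name__ == "__main__":
    toy = np.array([[0, 1, 5],
                    [1, 0, 3],
                    [5, 3, 0]])
    print(distance_expected(toy))
    print(observed_over_expected(toy))
```

An empirical per-diagonal mean is only one possible expected model and becomes noisy at large separations, where few bin pairs contribute; smoothed or parametric alternatives trade that noise for stronger assumptions about how contact frequency decays with distance.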
