Genome reconstruction and haplotype phasing using chromosome conformation capture methodologies.

Genomic analysis of individuals or organisms is predicated on the availability of high-quality reference and genotype information. With the rapidly dropping costs of high-throughput DNA sequencing, this information is becoming readily available for diverse organisms and for increasingly large populations of individuals. Despite these advances, some aspects of genome sequencing remain challenging for existing sequencing methods. These include the generation of long-range contiguity during genome assembly, the identification of structural variants in both germline and somatic tissues, the phasing of haplotypes in diploid organisms, and the resolution of genome sequences for organisms derived from complex samples. These types of information are valuable for understanding the effects of genome sequence and genetic variation on genome function, and numerous approaches have been developed to obtain them. Recently, chromosome conformation capture (3C) experiments, such as the Hi-C assay, have emerged as powerful tools for meeting these challenges in genome reconstruction. We review the current use of Hi-C as a tool for aiding genome sequencing, addressing the applications, strengths, limitations and potential future directions for the use of 3C data in genome analysis. We argue that unique features of Hi-C experiments make this data type a powerful tool for addressing challenges in genome sequencing, and that future integration of Hi-C data with alternative sequencing assays will facilitate the continuing revolution in genomic analysis and genome sequencing.
