Topological Data Analysis Generates High-Resolution, Genome-wide Maps of Human Recombination.

Meiotic recombination is a fundamental evolutionary process driving diversity in eukaryotes. In mammals, recombination is known to occur preferentially at specific genomic regions. Using topological data analysis (TDA), a branch of applied topology that extracts global features from large data sets, we developed an efficient method for mapping recombination at fine scales. Compared with standard linkage-based methods, TDA can handle larger numbers of SNPs and genomes without incurring prohibitive computational costs. We applied TDA to 1000 Genomes Project data and constructed high-resolution whole-genome recombination maps of seven human populations. Our analysis shows that recombination is generally under-represented within transcription start sites. However, the binding sites of specific transcription factors are enriched for sites of recombination. These include transcription factors that regulate the expression of meiosis- and gametogenesis-specific genes, cell cycle progression, and differentiation blockage. Additionally, our analysis identifies an enrichment for sites of recombination at repeat-derived loci matched by piwi-interacting RNAs.
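
The abstract does not spell out how topology detects recombination, so the following is only a rough sketch of the general idea, not the authors' pipeline. It assumes haplotypes are encoded as binary strings, compared by Hamming distance, and assembled into a Vietoris-Rips complex at a chosen distance scale. Without recurrent mutation, a tree-like history yields no loops in such a complex, whereas the four-gamete configuration (00, 01, 10, 11) produces a one-dimensional loop, counted by the first Betti number b1. The function names `hamming`, `gf2_rank`, and `first_betti_number` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are stored as integer bitmasks."""
    pivots = {}  # highest set bit -> row holding that pivot
    rank = 0
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                rank += 1
                break
            row ^= pivots[top]  # eliminate the leading bit and keep reducing
    return rank


def first_betti_number(haplotypes, scale):
    """b1 over GF(2) of the Vietoris-Rips complex at the given Hamming scale.

    Assumption for this sketch: distinct haplotypes are the vertices, and an
    edge joins two haplotypes whose Hamming distance is at most `scale`.
    """
    points = sorted(set(haplotypes))
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if hamming(points[i], points[j]) <= scale]
    edge_index = {e: k for k, e in enumerate(edges)}
    # Boundary of each edge: its two endpoint vertices.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of each triangle: its three edges (all must be present).
    d2 = [(1 << edge_index[(i, j)]) | (1 << edge_index[(i, k)]) | (1 << edge_index[(j, k)])
          for i, j, k in combinations(range(n), 3)
          if (i, j) in edge_index and (i, k) in edge_index and (j, k) in edge_index]
    # b1 = dim ker(d1) - rank(d2) = (#edges - rank(d1)) - rank(d2)
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)


tree_like = ["00", "01", "11"]           # explainable by mutation alone
four_gametes = ["00", "01", "10", "11"]  # incompatible with a tree

print(first_betti_number(tree_like, scale=1))     # 0
print(first_betti_number(four_gametes, scale=1))  # 1
```

In this toy case the loop appears at scale 1 and is filled in at scale 2, which is the kind of birth and death information that persistent homology records. Real data would need windowing along the genome and a dedicated persistent homology library rather than this brute-force construction.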
