Approximating the coalescent with recombination

The coalescent with recombination describes the distribution of genealogical histories and resulting patterns of genetic variation in samples of DNA sequences from natural populations. However, using the model as the basis for inference is currently severely restricted by the computational challenge of estimating the likelihood. We discuss why the coalescent with recombination is so challenging to work with and explore whether simpler models, under which inference is more tractable, may prove useful for genealogy-based inference. We introduce a simplification of the coalescent process in which coalescence between lineages with no overlapping ancestral material is banned. The resulting process has a simple Markovian structure when generating genealogies sequentially along a sequence, yet has very similar properties to the full model, both in terms of describing patterns of genetic variation and as the basis for statistical inference.
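As an illustration only, the sequential Markov structure described above can be sketched for the simplest case of two sampled sequences. The sketch is not taken from the paper; it makes these assumptions: time is measured in coalescent units, `rho` is the scaled recombination rate per unit of sequence length, and recombination falls on the current tree at rate `rho * t`, where `t` is the current coalescence time. The function name `smc_pair` and its parameters are hypothetical. At each recombination point the lineage above the break is discarded, so the detached lineage can only coalesce with the other sample's lineage, and the next tree depends on the current tree alone.

```python
import random


def smc_pair(seq_length, rho, seed=None):
    """Simulate coalescence times along a sequence for two samples.

    Illustrative sketch under the assumptions stated in the text above.
    Returns a list of (start, end, tmrca) segments covering [0, seq_length].
    """
    rng = random.Random(seed)
    # The first tree is a standard two-sample coalescent: TMRCA ~ Exp(1).
    t = rng.expovariate(1.0)
    pos = 0.0
    segments = []
    while True:
        # Distance to the next recombination event on the current tree.
        # Total branch length is 2t and each lineage recombines at rate rho/2.
        nxt = pos + rng.expovariate(rho * t)
        if nxt >= seq_length:
            segments.append((pos, seq_length, t))
            return segments
        segments.append((pos, nxt, t))
        # The break point is uniform on the tree, at height u below the root.
        u = rng.uniform(0.0, t)
        # The old branch above u is removed, so the floating lineage must
        # coalesce with the other lineage, at rate 1 starting from time u.
        t = u + rng.expovariate(1.0)
        pos = nxt


if __name__ == "__main__":
    for start, end, tmrca in smc_pair(10.0, 1.0, seed=42)[:5]:
        print(f"[{start:.3f}, {end:.3f})  TMRCA = {tmrca:.3f}")
```

Because each new coalescence time is drawn using only the current one, the sequence of trees is Markov along the sequence. Larger samples would need a full tree at each position rather than a single time, which this sketch does not attempt.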


[60]  L. Penrose,et al.  THE CORRELATION BETWEEN RELATIVES ON THE SUPPOSITION OF MENDELIAN INHERITANCE , 2022 .