Employing MCMC under the PPL framework to analyze sequence data in large pedigrees

The increased feasibility of whole-genome (or whole-exome) sequencing has led to renewed interest in using family data to find disease mutations. For clinical phenotypes that lend themselves to study in large families, this approach can be particularly effective, because it may be possible to obtain strong evidence of a causal mutation segregating in a single pedigree even under conditions of extreme locus and/or allelic heterogeneity at the population level. In this paper, we extend our capacity to carry out positional mapping in large pedigrees, using a combination of linkage analysis and within-pedigree linkage trait-variant disequilibrium analysis to fine-map down to the level of individual sequence variants. To do this, we develop a novel hybrid approach to the linkage portion, combining the non-stochastic approach to integration over the trait model implemented in the software package Kelvin with Markov chain Monte Carlo-based approximation of the marker likelihood using blocked Gibbs sampling, as implemented in the McSample program in the JPSGCS package. We illustrate both the positional mapping template and the efficacy of the hybrid algorithm by applying them to a single large pedigree with phenotypes simulated under a two-locus trait model.
