Multipoint linkage analysis with many multiallelic or dense diallelic markers: Markov chain-Monte Carlo provides practical approaches for genome scans on general pedigrees.

Computations for genome scans need to adapt to the increasing use of dense diallelic markers as well as of full-chromosome multipoint linkage analysis with either diallelic or multiallelic markers. Whereas suitable exact-computation tools are available for use with small pedigrees, equivalent exact computation for larger pedigrees remains infeasible. Markov chain-Monte Carlo (MCMC)-based methods currently provide the only computationally practical option. To date, no systematic comparison of the performance of MCMC-based programs is available, nor have these programs been systematically evaluated for use with dense diallelic markers. Using simulated data, we evaluate the performance of two MCMC-based linkage-analysis programs, lm_markers from the MORGAN package and SimWalk2, under a variety of analysis conditions. Pedigrees consisted of 14, 52, or 98 individuals in 3, 5, or 6 generations, respectively, with increasing amounts of missing data in larger pedigrees. One hundred replicates of markers and trait data were simulated on a 100-cM chromosome, with up to 10 multiallelic and up to 200 diallelic markers used simultaneously for computation of multipoint LOD scores. Exact computation was available for comparison in most situations, and comparison with a perfectly informative marker or interprogram comparison was available in the remaining situations. Our results confirm the accuracy of both programs in multipoint analysis with multiallelic markers on pedigrees of varied sizes and missing-data patterns, but there are some computational differences. In contrast, for large numbers of dense diallelic markers, only the lm_markers program was able to provide accurate results within a computationally practical time. Thus, programs in the MORGAN package are the first available to provide a computationally practical option for accurate linkage analyses in genome scans with both large numbers of diallelic markers and large pedigrees.
