Sample Reproducibility of Genetic Association Using Different Multimarker TDTs in Genome-Wide Association Studies: Characterization and a New Approach

Multimarker Transmission/Disequilibrium Tests (TDTs) are association tests that are robust to population admixture and structure, and they may be used to identify susceptibility loci in genome-wide association studies. Multimarker TDTs using several markers may increase power by capturing high-degree associations. However, there is also a risk of spurious associations and of power reduction due to the increase in degrees of freedom. In this study we show that associations found by tests built on simple null hypotheses are highly reproducible in a second independent data set regardless of the number of markers. As a test exhibiting this feature to its maximum, we introduce the multimarker 2-Groups TDT, a test which, under the hypothesis of no linkage, asymptotically follows a χ² distribution with 1 degree of freedom regardless of the number of markers. The statistic requires the division of parental haplotypes into two groups: a disease susceptibility haplotype group and a disease protective haplotype group. We assessed the behavior of the test by performing an extensive simulation study as well as a real-data study using several data sets of two complex diseases. We show that the test is highly efficient and that it achieves the highest power among all the tests used, even when the null hypothesis is tested in a second independent data set. Therefore, the proposed test turns out to be a very promising multimarker TDT for genome-wide searches for disease susceptibility loci, and it may be used as a preprocessing step in the construction of more accurate genetic models to predict individual susceptibility to complex diseases.
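
To illustrate why collapsing haplotypes into two groups yields a single degree of freedom, the following is a minimal sketch in Python. It is not the statistic proposed in this paper. It assumes that each parental haplotype has already been assigned to one of two groups, and it simply applies the classical McNemar-type TDT statistic, (b - c)^2 / (b + c), to the group labels. The function name `two_group_tdt`, the labels `"risk"` and `"protective"`, and the input layout are hypothetical choices made for this example; how the groups are actually formed is not described in the abstract and is not modelled here.

```python
from math import erfc, sqrt


def two_group_tdt(transmissions):
    """Classical McNemar-type TDT applied to two haplotype groups.

    `transmissions` is an iterable of (transmitted, untransmitted) pairs,
    one pair per parent, where each element is the hypothetical group
    label "risk" or "protective". This is an illustrative assumption,
    not the grouping procedure or statistic of the paper.
    """
    b = 0  # informative parents transmitting a "risk" haplotype
    c = 0  # informative parents transmitting a "protective" haplotype
    for transmitted, untransmitted in transmissions:
        if transmitted == untransmitted:
            continue  # both haplotypes in the same group: uninformative
        if transmitted == "risk":
            b += 1
        else:
            c += 1
    if b + c == 0:
        return 0.0, 1.0  # no informative transmissions
    stat = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Toy data: 30 informative "risk" transmissions, 10 informative
# "protective" transmissions, and 5 uninformative parents.
toy = (
    [("risk", "protective")] * 30
    + [("protective", "risk")] * 10
    + [("risk", "risk")] * 5
)
stat, p_value = two_group_tdt(toy)
print(stat, p_value)
```

Because only one contrast (risk versus protective) is tested, the reference distribution in this sketch keeps 1 degree of freedom no matter how many markers define the underlying haplotypes.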
