Backward Genotype-Trait Association (BGTA)-Based Dissection of Complex Traits in Case-Control Designs

Background: Studies of complex traits pose new challenges for current methods that evaluate association between genotypes and a specific trait. Considering possible interactions among loci leads to a dimensionality that current statistical methods cannot handle.

Methods: In this article, we evaluate a multi-marker screening algorithm for case-control designs, the backward genotype-trait association (BGTA) algorithm, which uses unphased multi-locus genotypes. BGTA carries out a global investigation of a candidate marker set and automatically screens out markers that carry little information about the trait in question. To address the ‘too many possible genotypes, too few informative chromosomes’ dilemma of a genomic-scale study with hundreds to thousands of markers, we further investigate a BGTA-based marker selection procedure in which the screening algorithm is repeated on a large number of random marker subsets. The results of these screenings are then aggregated into counts of how many times each marker is retained by the BGTA algorithm. Markers with exceptionally high return counts are selected for further analysis.

Results and Conclusion: Evaluated using simulations under several disease models, the proposed methods prove to be more powerful in dealing with epistatic traits. We also demonstrate the proposed methods through an application to a study of inflammatory bowel disease.
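The aggregation step of the marker selection procedure can be illustrated with a minimal sketch. It shows only the outer loop: drawing random marker subsets, screening each one, and tallying how often each marker is returned. The BGTA screening statistic itself is not specified in this abstract, so the `screen` argument is a caller-supplied placeholder, and `toy_screen`, the marker names, the subset size, and the number of repetitions are all hypothetical choices made for illustration.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Iterable, Sequence


def aggregate_return_counts(
    markers: Sequence[Hashable],
    screen: Callable[[Sequence[Hashable]], Iterable[Hashable]],
    n_subsets: int,
    subset_size: int,
    seed: int = 0,
) -> Counter:
    """Repeat a screening step on random marker subsets and count returns.

    `screen` stands in for the BGTA screening algorithm: it receives one
    random subset and returns the markers it retains.
    """
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n_subsets):
        # Draw a random subset of markers without replacement.
        subset = rng.sample(list(markers), subset_size)
        # Tally every marker the screening step keeps.
        counts.update(screen(subset))
    return counts


def toy_screen(subset: Sequence[str]) -> list:
    """Hypothetical stand-in for BGTA (NOT the real algorithm).

    It simply retains markers from a fixed, made-up 'informative' set.
    """
    informative = {"m3", "m7"}
    return [m for m in subset if m in informative]


# Hypothetical candidate set of 20 markers, screened on 200 random subsets.
markers = [f"m{i}" for i in range(20)]
counts = aggregate_return_counts(markers, toy_screen, n_subsets=200, subset_size=5)

# Markers with the highest return counts would go forward to further analysis.
top_markers = [m for m, _ in counts.most_common(2)]
```

In a real analysis the placeholder would be replaced by the actual screening algorithm, and the cutoff for "exceptionally high" counts would need a principled choice rather than a fixed top-two rule.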
