Power calculations for genetic association studies using estimated probability distributions.

Determining the power of a genetic association study that exploits linkage disequilibrium, or an appropriate sample size for one, requires many assumptions. Among the more important are the strength of linkage disequilibrium between alleles at the observed marker loci and alleles at a potential trait-influencing locus, the frequencies of the marker and trait-influencing alleles, and the density of the marker-locus "map" (i.e., the number of bases between marker loci) needed to identify trait-influencing alleles with some confidence. I consider an approach to assessing the power and sample-size requirements of genetic case-control association study designs that uses empirically derived estimates of the distributions of important parameters, which are often simply assumed to take on arbitrary values. The proposed methodology is extremely general and flexible and can ultimately provide realistic answers to questions such as "How many markers and/or how many individuals might it take to identify, with confidence, a disease gene via linkage-disequilibrium and association methods, from a candidate-region or whole-genome perspective?" I showcase aspects of the proposed methodology using information abstracted from the literature.
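To make the general idea concrete, the following is a minimal sketch of how power might be averaged over estimated parameter distributions instead of being computed at a single assumed value. It is not the procedure from the paper itself. Everything specific in it is an assumption made for illustration: the two-proportion z-test on allele counts, the multiplicative risk model with a rare disease (so that control frequencies approximate population frequencies), the Gaussian smoothed bootstrap with a normal-reference bandwidth, the function names, and above all the `D_PRIMES` and `MARKER_FREQS` lists, which are made-up placeholders and not values taken from any study.

```python
import math
import random
from statistics import NormalDist, stdev

_STD_NORMAL = NormalDist()

# Placeholder "observed" values, invented for illustration only.
D_PRIMES = [0.15, 0.32, 0.48, 0.55, 0.71, 0.83, 0.90, 0.97]
MARKER_FREQS = [0.08, 0.12, 0.21, 0.27, 0.35, 0.41, 0.46]


def allelic_test_power(p_case, p_control, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided z-test comparing allele frequencies."""
    alleles_case, alleles_control = 2 * n_cases, 2 * n_controls
    se = math.sqrt(p_case * (1 - p_case) / alleles_case
                   + p_control * (1 - p_control) / alleles_control)
    if se == 0.0:
        return 0.0
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(p_case - p_control) / se
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def marker_freqs(q, m, d_prime, rel_risk):
    """Marker allele frequency in cases and controls.

    q: trait allele frequency, m: marker allele frequency,
    d_prime: D' (taken as non-negative), rel_risk: per-allele relative risk.
    Assumes a multiplicative model and a rare disease.
    """
    d = d_prime * min(q * (1 - m), m * (1 - q))      # D = D' * Dmax
    q_case = rel_risk * q / (rel_risk * q + 1 - q)   # trait allele freq in cases
    p_case = m + d * (q_case - q) / (q * (1 - q))
    return p_case, m


def smoothed_draw(values, rng, lo, hi):
    """One draw from a Gaussian kernel density estimate of `values`.

    Clipping to [lo, hi] is a simplification that distorts the density
    near the boundaries.
    """
    bandwidth = 1.06 * stdev(values) * len(values) ** (-0.2)
    x = rng.choice(values) + rng.gauss(0.0, bandwidth)
    return min(hi, max(lo, x))


def expected_power(n_cases, n_controls, d_primes, marker_mafs,
                   q, rel_risk, draws=2000, seed=0):
    """Average the power over draws from the estimated distributions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        d_prime = smoothed_draw(d_primes, rng, 0.0, 1.0)
        m = smoothed_draw(marker_mafs, rng, 0.01, 0.99)
        p_case, p_control = marker_freqs(q, m, d_prime, rel_risk)
        total += allelic_test_power(p_case, p_control, n_cases, n_controls)
    return total / draws


if __name__ == "__main__":
    for n in (250, 500, 1000, 2000):
        power = expected_power(n, n, D_PRIMES, MARKER_FREQS, q=0.2, rel_risk=1.5)
        print(f"{n} cases / {n} controls: expected power ~ {power:.3f}")
```

Scanning over `n` in this way gives a rough sample-size curve under the stated assumptions. Replacing the placeholder lists with published estimates, or drawing `q` and `rel_risk` from their own estimated distributions, would follow the same pattern.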
