Using linkage genome scans to improve power of association in genome scans.

Scanning the genome for association between markers and complex diseases typically requires testing hundreds of thousands of genetic polymorphisms. Testing such a large number of hypotheses exacerbates the trade-off between power to detect meaningful associations and the chance of making false discoveries. Even before the full genome is scanned, investigators often favor certain regions on the basis of the results of prior investigations, such as previous linkage scans. The remaining regions of the genome are investigated simultaneously because genotyping is relatively inexpensive compared with the cost of recruiting participants for a genetic study and because prior evidence is rarely sufficient to rule out these regions as harboring genes with variation conferring liability (liability genes). However, the multiple testing inherent in broad genomic searches diminishes power to detect association, even for genes falling in regions of the genome favored a priori. Multiple testing problems of this nature are well suited to the false-discovery rate (FDR) principle, which can improve power. To enhance power further, a new FDR approach is proposed that weights the hypotheses on the basis of prior data. We present a method for using linkage data to weight the association P values. Our investigations reveal that if the linkage study is informative, the procedure improves power considerably. Remarkably, the loss in power is small even when the linkage study is uninformative. For a class of genetic models, we calculate the sample size required to obtain useful prior information from a linkage study. This inquiry reveals that, among genetic models that are seemingly equal in genetic information, some are much more promising than others for this mode of analysis.
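The idea of weighting association P values inside an FDR procedure can be illustrated with a minimal sketch. The abstract does not specify how linkage data are turned into weights, so the weights below are hypothetical placeholders. The sketch assumes the common weighted Benjamini-Hochberg form, in which each P value is divided by a nonnegative weight (weights rescaled to average 1) before the usual step-up rule is applied. The function name `weighted_bh` and the example numbers are illustrative only, not the authors' implementation.

```python
import numpy as np

def weighted_bh(pvalues, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg step-up procedure (illustrative sketch).

    pvalues : association P values, one per marker
    weights : nonnegative prior weights (hypothetical here; in the paper's
              setting they would be derived from linkage evidence)
    Returns a boolean array marking the rejected hypotheses.
    """
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.shape != w.shape:
        raise ValueError("pvalues and weights must have the same shape")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be nonnegative and not all zero")

    # Rescale so the weights average 1, which keeps the overall testing
    # budget the same as in the unweighted procedure.
    w = w / w.mean()

    # Weighted P values: up-weighted markers get smaller values.
    # A weight of zero means the marker can never be rejected.
    q = np.full_like(p, np.inf)
    positive = w > 0
    q[positive] = p[positive] / w[positive]

    # Standard BH step-up rule applied to the weighted P values.
    m = p.size
    order = np.argsort(q)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = q[order] <= thresholds

    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank passing its threshold
        reject[order[: k + 1]] = True
    return reject

# Toy example: four markers, the first lying in a region favored a priori.
p_values = [0.03, 0.2, 0.5, 0.9]
print(weighted_bh(p_values, [1, 1, 1, 1]))          # equal weights
print(weighted_bh(p_values, [3, 0.5, 0.25, 0.25]))  # first marker up-weighted
```

In this toy example, equal weights reject nothing at the 0.05 level, while up-weighting the first marker lets its P value of 0.03 pass the threshold. The cost is that down-weighted markers face a stricter threshold, which is why the informativeness of the prior data matters.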
