An Alternative Foundation for the Planning and Evaluation of Linkage Analysis

The ‘multiple testing problem’ currently bedevils the field of genetic epidemiology. Briefly stated, this problem arises whenever more than one statistical test is performed, and it results in an increased probability of committing at least one Type I error. The conventional way of dealing with this problem is based on the classical Neyman-Pearson statistical paradigm and involves adjusting one’s error probabilities. This adjustment is, however, problematic because in making it, one is also adjusting one’s measure of evidence. Investigators have actually become wary of looking at their data, for fear of having to adjust the strength of the evidence they observed at a given locus on the genome every time they conduct an additional test. In a companion paper in this issue (Strug & Hodge I), we presented an alternative statistical paradigm, the ‘evidential paradigm’, to be used when planning and evaluating linkage studies. The evidential paradigm uses the lod score as the measure of evidence (as opposed to a p value), and provides new error probabilities, defined as alternatives to the Type I and Type II error rates. We showed how this paradigm decouples the two concepts of error probabilities and strength of the evidence. In the current paper we apply the evidential paradigm to the multiple testing problem, specifically to multiple testing in the context of linkage analysis. We advocate using the lod score as the sole measure of the strength of evidence; we then derive the corresponding probabilities of being misled by the data under different multiple testing scenarios. We distinguish two situations: performing multiple tests of a single hypothesis vs. performing a single test of multiple hypotheses. For the first situation, we show that the probability of being misled remains small regardless of the number of times one tests the single hypothesis. For the second situation, we provide a rigorous argument outlining how replication samples themselves (analyzed in conjunction with the original sample) constitute appropriate adjustments for conducting multiple hypothesis tests on a data set.
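The claim about repeated tests of a single hypothesis can be illustrated with a small simulation. The sketch below is not the authors' derivation. It assumes phase-known meioses, a simple alternative recombination fraction of 0.1, and a look at the running lod score after every meiosis. It relies on the standard bound for likelihood ratios of two simple hypotheses: when the null is true, the probability that the ratio ever reaches k is at most 1/k, so a lod threshold t corresponds to a bound of 10^(-t). The function names and all parameter values are hypothetical. A threshold of 1 is used only so that the event is common enough to estimate quickly; a threshold of 3 would give a bound of 1/1000.

```python
import math
import random


def ever_misled(n_meioses, theta_alt, threshold, rng):
    """Simulate phase-known meioses under no linkage (theta = 0.5) and
    return True if the running lod score ever reaches `threshold`."""
    lod_nonrec = math.log10((1.0 - theta_alt) / 0.5)  # increment per non-recombinant
    lod_rec = math.log10(theta_alt / 0.5)             # increment per recombinant
    lod = 0.0
    for _ in range(n_meioses):
        # Under the null hypothesis each meiosis is recombinant with probability 0.5.
        lod += lod_rec if rng.random() < 0.5 else lod_nonrec
        if lod >= threshold:  # the data are examined after every meiosis
            return True
    return False


def estimate_prob_misled(n_reps=10000, n_meioses=100, theta_alt=0.1,
                         threshold=1.0, seed=0):
    """Monte Carlo estimate of the probability of misleading evidence
    when the lod score is checked after each of `n_meioses` meioses."""
    rng = random.Random(seed)
    hits = sum(ever_misled(n_meioses, theta_alt, threshold, rng)
               for _ in range(n_reps))
    return hits / n_reps


if __name__ == "__main__":
    threshold = 1.0
    estimate = estimate_prob_misled(threshold=threshold)
    bound = 10.0 ** (-threshold)  # 1/k with k = 10**threshold
    print(f"estimated probability of being misled: {estimate:.4f}")
    print(f"theoretical upper bound (1/k):         {bound:.4f}")
```

Increasing `n_meioses` adds more looks at the data, but the estimate should stay at or below the bound apart from Monte Carlo noise, which is the sense in which repeated testing of one hypothesis does not inflate the probability of being misled.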

[1] Daniel Q. Naiman, et al. A Comprehensive Method for Genome Scans, 2003, Human Heredity.

[2] Alice S. Whittemore, et al. Genetic Association Studies: Time for a New Paradigm?, 2005, Cancer Epidemiology Biomarkers & Prevention.

[3] M. Gladis, et al. Genomewide significant linkage to recurrent, early-onset major depressive disorder on chromosome 15q, 2004, American Journal of Human Genetics.

[4] J. Chotai, et al. On the lod score method in linkage analysis, 1984, Annals of Human Genetics.

[5] Lisa J. Strug, et al. An Alternative Foundation for the Planning and Evaluation of Linkage Analysis, 2006, Human Heredity.

[6] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[7] R. C. Elston, et al. Lods, wrods, and mods: The interpretation of lod scores calculated under different models, 1994, Genetic Epidemiology.

[8] G. Casella, et al. Statistical Inference, 2003, Encyclopedia of Social Network Analysis and Mining.

[9] M. Spence, et al. Simulated data for a complex genetic trait (Problem 2 for GAW11): How the model was developed, and why, 1999, Genetic Epidemiology.

[10] G. Abecasis, et al. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies, 2006, Nature Genetics.

[11] D. Pal, et al. Effect of misspecification of gene frequency on the two-point LOD score, 2001, European Journal of Human Genetics.

[12] S. E. Hodge, et al. Sensitivity of lod scores to changes in diagnostic status, 1992, American Journal of Human Genetics.

[13] Jeffrey D. Blume, et al. Likelihood methods for measuring statistical evidence, 2002, Statistics in Medicine.

[14] R. Royall. On the Probability of Observing Misleading Statistical Evidence, 2000.

[15] S. E. Hodge, et al. Magnitude of type I error when single-locus linkage analysis is maximized over models: a simulation study, 1997, American Journal of Human Genetics.

[16] C. Bonaïti-Pellié, et al. Effects of misspecifying genetic parameters in lod score analysis, 1986, Biometrics.

[17] E. Lander, et al. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results, 1995, Nature Genetics.

[18] K. J. Rothman, et al. No Adjustments Are Needed for Multiple Comparisons, 1990, Epidemiology.

[19] Ian Hacking. Logic of Statistical Inference, 1965.

[20] H. Robbins. Statistical Methods Related to the Law of the Iterated Logarithm, 1970.

[21] S. E. Hodge, et al. The power to detect linkage in complex disease by means of simple LOD-score analyses, 1998, American Journal of Human Genetics.

[22] R. Royall. Statistical Evidence: A Likelihood Paradigm, 1997.

[23] Allan Birnbaum, et al. More on Concepts of Statistical Evidence, 1972.

[24] V. Vieland, et al. Statistical Evidence: A Likelihood Paradigm, 1998.

[25] Cedric A. B. Smith, et al. The Detection of Linkage in Human Genetics, 1953.

[26] N. Morton. Sequential tests for the detection of linkage, 1955, American Journal of Human Genetics.

[27] J. Cornfield. Sequential Trials, Sequential Analysis and the Likelihood Principle, 1966.

[28] R. C. Elston, et al. Man bites dog? The validity of maximizing lod scores to determine mode of inheritance, 1989, American Journal of Medical Genetics.

[29] J. Witte, et al. Genetic dissection of complex traits, 1996, Nature Genetics.

[30] D. Clayton, et al. Betting odds and genetic associations, 2004, Journal of the National Cancer Institute.

[31] Robert V. Hogg, et al. Introduction to Mathematical Statistics, 1966.

[32] D. Saville. Multiple Comparison Procedures: The Practical Solution, 1990.