Outlier Detection and False Discovery Rates for Whole-Genome DNA Matching

We define a statistic, called the matching statistic, for locating regions of the genome that exhibit excess similarity among cases when compared to controls. Such regions are reasonable candidates for harboring disease genes. We find the asymptotic distribution of the statistic while accounting for correlations among sampled individuals. We then use the Benjamini and Hochberg false discovery rate (FDR) method for multiple hypothesis testing to find regions of excess sharing. The p values for each region involve estimated nuisance parameters. Under appropriate conditions, we show that the FDR method based on p values and with estimated nuisance parameters asymptotically preserves the FDR property. Finally, we apply the method to a pilot study on schizophrenia.
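As an illustration of the multiple-testing step, the following is a minimal sketch of the standard Benjamini and Hochberg step-up procedure applied to a list of per-region p values. It assumes the p values are already computed and treats them as known. It does not implement the matching statistic, the correlation adjustment, or the nuisance-parameter estimation described above. The function name `benjamini_hochberg` and the example values in `region_p` are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: with m p-values sorted as
    p_(1) <= ... <= p_(m), find the largest k with p_(k) <= k * alpha / m
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject everything up to and including rank k_max (step-up behaviour).
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five genomic regions (illustrative only).
region_p = [0.001, 0.30, 0.012, 0.04, 0.75]
print(benjamini_hochberg(region_p, alpha=0.05))
```

Because the rule is step-up, a region can be rejected even if its own p value exceeds its rank threshold, provided a larger-ranked p value falls below its threshold.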
