‘Location, Location, Location’: a spatial approach for rare variant analysis and an application to a study on non-syndromic cleft lip with or without cleft palate

Motivation: For the analysis of rare variants in sequence data, numerous approaches have been suggested. Fixed and flexible threshold approaches collapse the rare variant information of a genomic region into a test statistic with reduced dimensionality. Alternatively, the rare variant information can be combined in statistical frameworks that are based on suitable regression models, machine learning, etc. Although the existing approaches provide powerful tests that can incorporate information on allele frequencies and prior biological knowledge, they cannot incorporate differences in the spatial clustering of rare variants between cases and controls. Based on the assumption that deleterious variants and protective variants cluster or occur in different parts of the genomic region of interest, we propose a testing strategy for rare variants that builds on spatial cluster methodology and that guides the identification of the biologically relevant segments of the region. Our approach does not require any assumption about the directions of the genetic effects.

Results: In simulation studies, we assess the power of the clustering approach and compare it with existing methodology. Our simulation results suggest that the clustering approach for rare variants is well powered, even in situations that are ideal for standard methods. The efficiency of our spatial clustering approach is not affected by the presence of rare variants that have opposite effect size directions. An application to a sequencing study for non-syndromic cleft lip with or without cleft palate (NSCL/P) demonstrates its practical relevance. The proposed testing strategy is applied to a genomic region on chromosome 15q13.3 that was implicated in NSCL/P etiology in a previous genome-wide association study, and its results are compared with those of standard approaches.

Availability: Source code and documentation for the implementation in R will be provided online. Currently, the R implementation supports only genotype data; an extension for VCF files is in progress.

Contact: heide.fier@googlemail.com
