A New Method for Detecting Associations with Rare Copy-Number Variants

Copy number variants (CNVs) play an important role in the etiology of many diseases, including cancers and psychiatric disorders. Because individual rare CNVs are infrequent and may have only modest marginal effects, a key strategy is to collapse them and test their collective effect on disease risk. Although many powerful collapsing methods exist for sequence variants (e.g., SNPs) in association analysis, they cannot be applied directly to rare CNVs because of two CNV-specific challenges: the multi-faceted nature of CNV polymorphisms (CNVs vary in size, type, dosage, and details of gene disruption) and etiological heterogeneity (e.g., duplications and deletions within a locus or in different loci can have heterogeneous effects). Existing CNV collapsing methods (i.e., burden tests) tend to perform suboptimally because they often ignore heterogeneity and evaluate only the marginal effect of a CNV feature. We introduce CCRET, a random effects test for collapsing rare CNVs when searching for disease associations. CCRET is applicable to variants measured on a multi-categorical scale, collectively models the effects of multiple CNV features, and is robust to etiological heterogeneity. It can also adjust for multiple confounders simultaneously. To evaluate the performance of CCRET, we conducted extensive simulations and analyzed large-scale schizophrenia datasets. We show that CCRET is powerful and robust under multiple types of etiological heterogeneity, and performs comparably to or better than existing methods when there is no heterogeneity.
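
The heterogeneity problem described above can be illustrated with a small sketch. This is not the CCRET implementation. It is a generic toy comparison, under assumed data, between a burden-style score (which sums deletion and duplication counts into one number per subject) and a random-effects-style quadratic form with a linear kernel over the separate CNV features. The function names, the 0/1 coding, and the eight-subject dataset are all hypothetical, and the null distribution needed for a p-value is omitted.

```python
import numpy as np

def burden_score(Z, y):
    """Squared score for the collapsed CNV count: (sum_j Z_j' r)^2."""
    r = y - y.mean()                 # residuals under an intercept-only null
    collapsed = Z.sum(axis=1)        # one burden value per subject
    return float((collapsed @ r) ** 2)

def variance_component_score(Z, y):
    """Quadratic form Q = r' K r with linear kernel K = Z Z'."""
    r = y - y.mean()
    K = Z @ Z.T                      # similarity of subjects across CNV features
    return float(r @ K @ r)

# Hypothetical toy data: columns are [deletion count, duplication count].
# Deletions appear only in cases and duplications only in controls,
# i.e. the two CNV types act in opposite directions.
y = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
Z = np.array([
    [1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0],   # cases
    [0.0, 1.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0],   # controls
])

print(burden_score(Z, y))              # 0.0: opposing effects cancel after collapsing
print(variance_component_score(Z, y))  # 2.0: per-feature signals are squared, then summed
```

In this toy setting cases and controls carry the same total number of CNVs, so the collapsed count carries no signal, while the quadratic form still registers that each CNV type is unevenly distributed between the two groups. The raw statistics are on different scales, so the comparison only shows the cancellation effect and says nothing about power.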
