The clustering of functionally related genes contributes to CNV-mediated disease

Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10⁻³). Using three different functional networks, we identified unexpectedly large numbers of functionally related genes within de novo CNVs from two large independent cohorts of individuals with developmental disorders. The presence of multiple functionally related genes was a significant predictor of a CNV's pathogenicity when compared to CNVs from apparently healthy individuals and, for larger CNVs, a better predictor than the presence of known disease or haploinsufficient genes. The functionally related genes found in the de novo CNVs belonged to 70% of all clusters of functionally related genes found across the genome. De novo CNVs were more likely than benign CNVs to affect functional clusters, and they affected them to a greater extent (P = 6 × 10⁻⁴). Furthermore, such clusters of functionally related genes are phenotypically informative: different patients possessing CNVs that affect the same cluster of functionally related genes exhibit more similar phenotypes than expected (P < 0.05). The spanning of multiple functionally similar genes by single CNVs contributes substantially to how these variants exert their pathogenic effects.
