An Ancient Evolutionary Origin of Genes Associated with Human Genetic Diseases

Several thousand genes in the human genome have been linked to a heritable genetic disease. The majority of these appear to be nonessential genes (i.e., their inactivation is not embryonically lethal), and one could therefore speculate that they are late additions in the evolutionary lineage leading to humans. Contrary to this expectation, we find that they are in fact significantly overrepresented among the genes that emerged during the early evolution of the metazoa. Using a phylostratigraphic approach, we have studied the evolutionary emergence of such genes at 19 phylogenetic levels. The majority of disease genes were already present in the eukaryotic ancestor, and the second-largest group arose around the time multicellularity evolved. Conversely, genes specific to the mammalian lineage are highly underrepresented. Hence, genes involved in genetic diseases are not simply a random subset of all genes in the genome but are biased toward ancient genes.
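To make the notion of over- and underrepresentation concrete, the following is a minimal sketch of how disease-gene counts per phylogenetic level might be compared with the counts expected by chance. The abstract does not state which statistical test was used, so the one-sided hypergeometric test here is an assumption, as are the function names `hypergeom_sf` and `phylostratum_enrichment`. The level names and counts in `toy_strata` are invented purely for illustration and are not results from the study.

```python
from math import comb, log2


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are disease genes (upper tail of the hypergeometric)."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def phylostratum_enrichment(strata):
    """strata maps a level name to (genes_in_level, disease_genes_in_level).

    Returns, per level, the observed and expected disease-gene counts, the
    log2 ratio of observed to expected, and the one-sided p-value for
    overrepresentation. No multiple-testing correction is applied here.
    """
    N = sum(total for total, _ in strata.values())      # all genes
    K = sum(disease for _, disease in strata.values())  # all disease genes
    results = {}
    for name, (n, k) in strata.items():
        expected = n * K / N
        results[name] = {
            "observed": k,
            "expected": expected,
            "log_odds": log2(k / expected) if k > 0 else float("-inf"),
            "p_over": hypergeom_sf(k, N, K, n),
        }
    return results


# Hypothetical toy counts (NOT data from the paper): three levels only.
toy_strata = {
    "eukaryote_origin": (400, 60),
    "metazoa_origin": (300, 35),
    "mammal_specific": (300, 5),
}

if __name__ == "__main__":
    for level, stats in phylostratum_enrichment(toy_strata).items():
        print(level, stats)
```

A positive `log_odds` indicates more disease genes at a level than expected from its share of the genome, and a negative value indicates fewer. With many levels tested at once, the resulting p-values would normally be adjusted for multiple testing.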
