Systematic analysis, comparison, and integration of disease based human genetic association data and mouse genetic phenotypic information

Background: The genetic contributions to common human disorders and to mouse genetic models of disease are complex and often overlapping. In common human diseases, unlike classical Mendelian disorders, genetic factors generally have small effect sizes, are multifactorial, and are highly pleiotropic. Likewise, mouse genetic models of disease often have pleiotropic and overlapping phenotypes. Moreover, phenotypic descriptions in the literature for both human and mouse are often poorly characterized and difficult to compare directly.

Methods: In this report, human genetic association results from the literature are summarized with regard to replication, disease phenotype, and gene-specific results, and are organized in the context of a systematic disease ontology. Similarly summarized mouse genetic disease models are organized within the Mammalian Phenotype ontology. Human and mouse disease- and phenotype-based gene sets are identified. These disease gene sets are then compared individually and in large groups through dendrogram analysis and hierarchical clustering analysis.

Results: Human disease and mouse phenotype gene sets are shown to group into disease-relevant and phenotypically relevant groups at both a coarse and a fine level based on gene sharing.

Conclusion: This analysis provides a systematic and global perspective on the genetics of common human disease, both as compared to itself and in the context of mouse genetic models of disease.
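As a rough illustration of what grouping gene sets "based on gene sharing" could look like, the sketch below clusters a handful of gene sets hierarchically. It is not the authors' pipeline: the abstract does not name a distance measure or linkage rule, so Jaccard distance and average linkage are assumptions made here, and the set names (`disease_A`, `phenotype_C`) and gene symbols (`G1` to `G9`) are hypothetical placeholders rather than data from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty sets count as maximally distant."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(gene_sets):
    """Agglomerative clustering of named gene sets.

    Assumption: average linkage over Jaccard distances (not stated in the source).
    Returns the merges in order as (members_left, members_right, distance).
    """
    clusters = {name: (name,) for name in gene_sets}  # cluster label -> member names
    merges = []

    def dist(x, y):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(m, n) for m in clusters[x] for n in clusters[y]]
        total = sum(jaccard_distance(gene_sets[m], gene_sets[n]) for m, n in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Pick the closest pair of current clusters (ties broken by label order).
        x, y = min(combinations(sorted(clusters), 2), key=lambda p: dist(*p))
        merges.append((clusters[x], clusters[y], dist(x, y)))
        clusters[x + "+" + y] = clusters.pop(x) + clusters.pop(y)
    return merges


# Hypothetical toy input: placeholder names and genes, not results from the paper.
TOY_SETS = {
    "disease_A": {"G1", "G2", "G3"},
    "disease_B": {"G2", "G3", "G4"},
    "phenotype_C": {"G7", "G8"},
    "phenotype_D": {"G8", "G9"},
}

if __name__ == "__main__":
    for left, right, d in average_linkage(TOY_SETS):
        print(left, right, round(d, 3))
```

The order of the merges is what a dendrogram would draw: sets that share many genes join early at small distances, and unrelated groups join last at a distance near 1.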
