Moving toward a system genetics view of disease

Testing hundreds of thousands of DNA markers in human, mouse, and other species for association with complex traits such as disease is now a reality. However, the information on how DNA variation affects complex physiological processes flows through transcriptional and other molecular networks. In other words, DNA variations affect complex diseases through the perturbations they cause in transcriptional and other biological networks, and these molecular phenotypes are intermediate to clinically defined disease. Because it is now also possible to monitor transcript levels comprehensively, integrating DNA variation, transcription, and phenotypic data has the potential to enhance the identification of associations between DNA variation and diseases such as obesity and diabetes, and to characterize the parts of the molecular networks that drive these diseases. Toward that end, we review methods for integrating expression quantitative trait loci (eQTLs), gene expression, and clinical data to infer causal relationships among gene expression traits and between expression and clinical traits. We further describe methods to integrate these data more comprehensively by constructing gene coexpression networks that leverage pairwise gene interaction data to represent more general relationships. To infer gene networks that capture causal information, we describe a Bayesian algorithm that further integrates eQTLs, expression, and clinical phenotype data to reconstruct whole-gene networks capable of representing causal relationships among genes and traits in the network. These emerging network approaches, which process high-dimensional biological data by integrating multiple sources, represent some of the first steps in statistical genetics toward identifying the multiple genetic perturbations that alter the states of molecular networks and in turn push systems into disease states.
Evolving statistical procedures that operate on networks will be critical to extracting information related to complex phenotypes such as disease as research moves beyond a single-gene focus. The early successes achieved with the methods described herein suggest that these more integrative genomics approaches to dissecting disease traits will significantly enhance the identification of key drivers of disease beyond what could be achieved by genetic association studies alone.
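
As an illustration of how a locus, an expression trait, and a clinical trait might be combined to compare causal orderings, the following is a minimal sketch and not the procedure reviewed in the article. It assumes simple linear-Gaussian relationships, scores three candidate models with AIC, and uses simulated data; the names `gaussian_loglik` and `rank_models`, the effect sizes, and the simplified "independent" model are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_loglik(y, x):
    """Maximized log-likelihood of the linear-Gaussian model y ~ a + b*x."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares fit
    resid = y - X @ beta
    sigma2 = np.mean(resid ** 2)                   # ML estimate of the noise variance
    n = len(y)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)


def rank_models(L, E, T):
    """Rank three candidate orderings of locus L, expression E, and trait T by AIC.

    The P(L) term is shared by all models and is therefore omitted.
    Each model has two regressions with three parameters each
    (intercept, slope, variance), so k = 6 throughout.
    """
    k = 6
    loglik = {
        "causal": gaussian_loglik(E, L) + gaussian_loglik(T, E),       # L -> E -> T
        "reactive": gaussian_loglik(T, L) + gaussian_loglik(E, T),     # L -> T -> E
        "independent": gaussian_loglik(E, L) + gaussian_loglik(T, L),  # E <- L -> T
    }
    aic = {model: 2 * k - 2 * ll for model, ll in loglik.items()}
    return sorted(aic.items(), key=lambda kv: kv[1])  # lowest AIC first


# Simulated example (assumed effect sizes): genotype drives expression,
# and expression drives the clinical trait.
rng = np.random.default_rng(0)
n = 500
L = rng.integers(0, 3, size=n).astype(float)  # genotype coded 0/1/2
E = 1.0 * L + rng.normal(size=n)              # expression depends on genotype
T = 0.8 * E + rng.normal(size=n)              # trait depends on expression

ranking = rank_models(L, E, T)
for model, score in ranking:
    print(f"{model:12s} AIC = {score:.1f}")
```

Because the genotype is fixed at conception, it serves as a causal anchor: if the trait is conditionally independent of the locus given expression, the data favor the ordering in which expression mediates the effect. Real analyses would need to account for multiple loci, hidden confounders, and measurement noise, none of which this sketch handles.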
