Statistical Power of Expression Quantitative Trait Loci for Mapping of Complex Trait Loci in Natural Populations

Recent genomewide surveys have found numerous quantitative trait loci (QTL) for gene expression, often with intermediate to high heritability values. As a result, there is currently a great deal of interest in genetical genomics, that is, the combination of genomewide expression data and molecular marker data to elucidate the genetics of complex traits. To date, most genetical genomics studies have focused on generating candidate genes for previously known trait loci or have otherwise leveraged existing knowledge about trait-related genes. The purpose of this study is to explore the potential of genetical genomics approaches in the context of genomewide scans for complex trait loci. I explore the expected strength of association between expression-level traits and a clinical trait as a function of the underlying genetic model in natural populations, and I give calculations of statistical power for detecting differential expression between affected and unaffected individuals. I model both reactive and causative expression-level traits, using both additive and multiplicative multilocus models for the relationship between phenotype and genotype, and explore a variety of assumptions about dominance, number of segregating loci, and other parameters. There are two key results. First, if a transcript is causative for the disease (in the sense that disease risk depends directly on transcript level), then the power to detect association between transcript and disease is high: sample sizes on the order of 100 are sufficient for 80% power. Second, if the transcript is reactive to a disease locus, then the correlation between the expression-level trait and disease is low unless the expression-level trait shares several causative loci with the disease, that is, unless the expression-level trait is itself a complex trait.
Thus, there is a trade-off between the power to show association between a reactive expression-level trait and the clinical trait of interest and the power to map expression-level QTL (eQTL) for that expression-level trait. Gene expression-level traits that are most strongly correlated with the clinical trait will themselves be complex traits and therefore often hard to map. Likewise, the expression-level traits that are easiest to map will tend to have a low correlation with the clinical trait. These results show some fundamental principles for understanding power in eQTL-based mapping studies.
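
As a rough illustration of the kind of power calculation described above, the following minimal sketch computes power for a generic two-group comparison of mean expression level (affected versus unaffected) under a normal approximation. It is not the genetic model used in this study: the function names, the equal group sizes, and the standardized effect size of 0.4 are assumptions chosen only for illustration, whereas in the study the expected difference in expression follows from the underlying genetic model.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in mean expression between affected and
    unaffected individuals, in units of the within-group standard deviation
    (a hypothetical input here, not a value taken from the study).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def sample_size_per_group(effect_size, power=0.80, alpha=0.05):
    """Smallest n per group reaching the target power.

    Uses the usual closed-form approximation, which ignores the negligible
    contribution of the opposite rejection tail.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    d = 0.4  # assumed standardized effect size, for illustration only
    n = sample_size_per_group(d)
    print(f"n per group for 80% power at d={d}: {n}")
    print(f"approximate power at that n: {power_two_sample(d, n):.3f}")
```

Under this approximation the required sample size scales with the inverse square of the effect size, so halving the standardized difference in expression roughly quadruples the number of individuals needed in each group.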
