In Silico Study of Transcriptome Genetic Variation in Outbred Populations

Dissecting the genetic architecture of regulatory elements on a genome-wide basis is now technically feasible. Because the potential medical and genetic implications of this kind of experiment are very large, it is paramount to assess the reliability and repeatability of the results. This is especially relevant in outbred populations, such as humans, where the genetic architecture is necessarily more complex than in crosses between inbred lines. Here we simulated a chromosome-wide SNP association study using real human microarray data. Our model predicted, as observed, a highly significant clustering of quantitative trait loci (QTL) for gene expression. Importantly, the estimates of QTL positions were often unstable: a 16% decrease in the number of individuals resulted in a loss of power of ∼30% and a large shift in the position estimate in ∼30–40% of the remaining significant QTL. We also found that analyzing two repeated measures of the same mRNA can yield two QTL that are located far apart. The intrinsic difficulties of analyzing outbred populations should not be underestimated. We anticipate that many conflicting results may be collected in the future if whole-genome association studies for mRNA levels are carried out in outbred populations.
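
The stability check described above, rescanning after removing 16% of the individuals and comparing the position of the strongest association, can be illustrated with a toy sketch. This is not the procedure used in the study, which relied on real microarray data. Everything here is an assumption made for illustration: the function names, the crude neighbour-copying model of linkage disequilibrium (parameter `ld`), the single additive causal SNP, and the use of squared correlation as the association statistic.

```python
import random


def simulate(n_ind, n_snps, causal, effect, ld, rng):
    """Toy data: genotypes[i][j] is a 0/1/2 allele count, expression is a float.

    Assumption: each haplotype copies its neighbouring allele with
    probability `ld`, a crude stand-in for linkage disequilibrium.
    """
    genos = []
    for _ in range(n_ind):
        haps = []
        for _ in range(2):
            h = [rng.random() < 0.5]
            for _ in range(1, n_snps):
                h.append(h[-1] if rng.random() < ld else rng.random() < 0.5)
            haps.append(h)
        genos.append([int(a) + int(b) for a, b in zip(*haps)])
    # Expression: additive effect of one causal SNP plus unit Gaussian noise.
    expr = [effect * g[causal] + rng.gauss(0.0, 1.0) for g in genos]
    return genos, expr


def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def scan(genos, expr):
    """Single-SNP scan: return (index of strongest SNP, r^2 per SNP)."""
    n_snps = len(genos[0])
    stats = [r_squared([g[j] for g in genos], expr) for j in range(n_snps)]
    peak = max(range(n_snps), key=stats.__getitem__)
    return peak, stats


def peak_shift_after_subsampling(genos, expr, drop_fraction, rng):
    """Compare the peak SNP in the full sample and in a random subsample."""
    keep_n = round(len(genos) * (1 - drop_fraction))
    keep = rng.sample(range(len(genos)), keep_n)
    full_peak, _ = scan(genos, expr)
    sub_peak, _ = scan([genos[i] for i in keep], [expr[i] for i in keep])
    return full_peak, sub_peak


# Example run: 60 individuals, 50 SNPs, modest effect, strong toy LD.
rng = random.Random(1)
genos, expr = simulate(n_ind=60, n_snps=50, causal=25, effect=0.5, ld=0.9, rng=rng)
full_peak, sub_peak = peak_shift_after_subsampling(genos, expr, 0.16, rng)
print("peak SNP, full sample:", full_peak, "| after dropping 16%:", sub_peak)
```

Repeating the last three lines over many seeds and counting how often `full_peak` and `sub_peak` differ gives a rough feel for positional instability under these toy assumptions. The numbers it produces are not comparable to the percentages reported above.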
