Mol Syst Biol. 5: 273
With the cost of DNA sequencing decreasing rapidly, it is likely that the genome sequences of many individuals will be determined. In fact, if half of the individuals in industrialized countries choose to have their genomes sequenced, then well over 500 million personal genome sequences will be determined. Currently, such genetic information is likely to be of limited value to the individual, as the number of loci that provide useful predictive information is quite small (probably fewer than 200). Indeed, recent analyses of common complex traits such as diabetes, body mass and height show that in each case the genetically identifiable contribution from multiple candidate loci (18 in the case of diabetes) is only a small percentage (less than 7%) of the total identifiable genetic load (Gaulton et al, 2008; Willer et al, 2009); thus, the interpretable genetic contributions that can be identified are quite minor. Presumably, either many low‐frequency alleles at different loci contribute to the genetic load, or perhaps many of the phenotypes arise from other phenomena, such as synergistic effects between variants at more than one locus or between different loci and factors in the environment, recurrent spontaneous mutations, or epigenetic defects.
Regardless of which proves to be correct (likely a differing mixture of effects for different diseases), the ability to accurately correlate all bases with precise phenotypes is likely to be powerful only if a common set of phenotypes is scored. Correlating 500 million sequences with 500 million phenotypes could both reveal small contributions and help identify potential causative mutations. Indeed, a data set of this size would greatly exceed that of even the large genome‐wide association studies that typically analyze thousands of individuals to tens of thousands …
[1] Joseph M. Jasinski. "Computational Biology and Bioinformatics". 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, 2006.
[2] Eugene A. Kapp, et al. Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly‐available database. Proteomics, 2005.
[3] G. Church, et al. The Personal Genome Project. Molecular systems biology, 2005.
[4] R. Aebersold, et al. Mass spectrometry-based proteomics. Nature, 2003.
[5] Christian Gieger, et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nature Genetics, 2009.
[6] Laura J. Scott, et al. Comprehensive Association Study of Type 2 Diabetes and Related Quantitative Traits With 222 Candidate Genes. Diabetes, 2008.
[7] John T. Wei, et al. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature, 2009.
[8] M. Gerstein, et al. RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics, 2009.