A Knowledge-Based Method for Association Studies on Complex Diseases

Complex disorders are a class of diseases whose phenotypic variance is caused by the interplay of multiple genetic and environmental factors. Analyzing the genetic architecture underlying such traits may help develop more efficient diagnostic tests and therapeutic protocols. Despite continuous advances in revealing the genetic basis of many complex diseases through genome-wide association studies (GWAS), a large proportion of their genetic variance remains unexplained, in part because GWAS cannot reliably detect small individual risk contributions or capture the underlying genetic heterogeneity. In this paper we describe a hypothesis-based method for analyzing the association between multiple genetic factors and a complex phenotype. Starting from sets of markers selected on the basis of preexisting biomedical knowledge, our method generates multi-marker models relevant to the biological process underlying a complex trait for which genotype data are available. We tested the applicability of our method using the WTCCC case-control dataset. Analyzing a number of biological pathways, the method identified several immune-system-related multi-SNP models significantly associated with rheumatoid arthritis (RA) and Crohn’s disease (CD). The RA-associated multi-SNP models were also replicated in an independent case-control dataset. The method we present provides a framework for capturing joint contributions of genetic factors to complex traits. In contrast to hypothesis-free approaches, its results can be given a direct biological interpretation. The replicated multi-SNP models generated by our analysis may serve as predictors for estimating the risk of RA development in individuals of Caucasian ancestry.
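To make the idea of testing a multi-SNP model against case-control status concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a model is a set of SNPs with one required genotype each, codes genotypes as risk-allele counts (0, 1, 2), and scores the model with a plain Pearson chi-square test on carriers versus non-carriers. The SNP identifiers, the toy counts, and the function names are all hypothetical.

```python
from math import erfc, sqrt


def carries_model(genotype, model):
    """Return True if the individual has the required genotype at every SNP in the model."""
    return all(genotype.get(snp) == required for snp, required in model.items())


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f.) and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information about association.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    # Upper tail of a chi-square with 1 d.f. equals erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


def score_model(cases, controls, model):
    """Compare how often cases and controls carry the full multi-SNP pattern."""
    a = sum(carries_model(g, model) for g in cases)     # carrier cases
    b = len(cases) - a                                  # non-carrier cases
    c = sum(carries_model(g, model) for g in controls)  # carrier controls
    d = len(controls) - c                               # non-carrier controls
    return chi_square_2x2(a, b, c, d)


# Hypothetical two-SNP model and toy genotype data (illustration only).
model = {"rs_A": 2, "rs_B": 1}
carrier = {"rs_A": 2, "rs_B": 1}
non_carrier = {"rs_A": 0, "rs_B": 1}
cases = [carrier] * 6 + [non_carrier] * 4
controls = [carrier] * 2 + [non_carrier] * 8

stat, p_value = score_model(cases, controls, model)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

In a knowledge-based setting, the candidate SNPs for such a model would be drawn from a pathway or gene set chosen in advance, and any reported p-value would need correction for the number of models examined.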
