Novel analytical methods applied to type 1 diabetes genome-scan data.

Complex traits like type 1 diabetes mellitus (T1DM) are generally taken to be under the influence of multiple genes interacting with each other to confer disease susceptibility and/or protection. Although novel methods are being developed, analyses of whole-genome scans are most often performed with multipoint methods that work under the assumption that multiple trait loci are unrelated to each other; that is, most models specify the effect of only one locus at a time. We have applied a novel approach, which includes decision-tree construction and artificial neural networks, to the analysis of T1DM genome-scan data. We demonstrate that this approach (1) allows identification of all major susceptibility loci identified by nonparametric linkage analysis, (2) identifies a number of novel regions as well as combinations of markers with predictive value for T1DM, and (3) may be useful in characterizing markers in linkage disequilibrium with protective-gene variants. Furthermore, the approach outlined here permits combined analyses of genetic-marker data and information on environmental and clinical covariates.
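As a rough illustration of the decision-tree side of such an approach, the sketch below ranks markers by information gain, the entropy-based splitting criterion used by C4.5-style tree builders. The abstract does not specify how the trees were constructed, so this is an assumption rather than the authors' procedure. The function names, marker names, genotype codes, and the six toy individuals are all hypothetical and invented for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(genotypes, labels):
    """Reduction in label entropy after splitting on one marker's genotypes."""
    total = len(labels)
    groups = {}
    for genotype, label in zip(genotypes, labels):
        groups.setdefault(genotype, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(ys) / total * entropy(ys) for ys in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration only (not from the genome scan).
status = ["affected", "affected", "affected",
          "unaffected", "unaffected", "unaffected"]
markers = {
    "marker_A": ["1/1", "1/1", "1/2", "2/2", "2/2", "1/2"],
    "marker_B": ["1/1", "1/2", "2/2", "1/1", "1/2", "2/2"],
}

# A tree builder would split first on the marker with the highest gain,
# then repeat the procedure within each resulting subset.
gains = {name: information_gain(g, status) for name, g in markers.items()}
best = max(gains, key=gains.get)
print(best, gains)
```

In this toy set, marker_A separates affected from unaffected individuals better than marker_B, so it would be chosen as the first split. Repeating the split within each genotype subset is what lets a tree pick up combinations of markers rather than one locus at a time.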
