Paper information - Finding Relevant SNP Sets and Predicting Disease Risk Using Simulated Annealing


We applied a simulated annealing algorithm and a decision tree to find a set of single nucleotide polymorphisms (SNPs) relevant to a disease and to build a risk prediction model. To address the time complexity of simulated annealing, which depends on the initial set and on candidate generation, we constructed an initial set of variants with a fast heuristic algorithm and proposed transition rules based on the contribution of the available variants. The experimental results show that this approach yields a new set of variants that is smaller and gives better prediction performance than the sets selected by traditional feature selection algorithms.
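The general idea of searching for a variant subset with simulated annealing can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `anneal_subset` and `toy_score`, the cooling schedule, and all parameter values are assumptions made for illustration. In particular, the candidate move below toggles a uniformly chosen variant, whereas the abstract describes transition rules weighted by each variant's contribution, and `toy_score` is a stand-in for the decision tree's prediction performance.

```python
import math
import random


def anneal_subset(n_features, score, initial, steps=2000, t0=1.0,
                  cooling=0.995, seed=0):
    """Search for a high-scoring feature subset with simulated annealing.

    `score` is any callable mapping a set of feature indices to a number
    (higher is better). All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    current = set(initial)
    current_score = score(current)
    best, best_score = set(current), current_score
    temp = t0
    for _ in range(steps):
        # Candidate move: toggle one uniformly chosen feature in or out.
        # (Assumption: the paper uses contribution-based transition rules.)
        j = rng.randrange(n_features)
        candidate = current ^ {j}
        if candidate:  # never evaluate an empty subset
            delta = score(candidate) - current_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / temp), which shrinks as temp cools.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, current_score = candidate, current_score + delta
                if current_score > best_score:
                    best, best_score = set(current), current_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score


# Hypothetical stand-in for a classifier's validation accuracy: variants
# 2, 5 and 7 are "relevant", and each selected variant costs a small penalty.
RELEVANT = {2, 5, 7}


def toy_score(subset):
    return len(subset & RELEVANT) / len(RELEVANT) - 0.02 * len(subset)


# The initial set here is arbitrary; the abstract builds it with a fast heuristic.
best, best_score = anneal_subset(10, toy_score, initial={0, 1, 2})
```

In a realistic setting, `score` would train a decision tree on the selected SNP columns and return a cross-validated performance estimate, which makes each evaluation expensive. That cost is the reason a good initial set and informed candidate moves matter for running time.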
