Classifying Y-Short Tandem Repeat Data: A Decision Tree Approach

Classification of Y-Short Tandem Repeat (Y-STR) data has recently been explored with both supervised and unsupervised methods. This study continues that effort by classifying Y-STR data with four decision tree models: Chi-squared Automatic Interaction Detection (CHAID), Classification and Regression Tree (CART), Quick, Unbiased, Efficient Statistical Tree (QUEST) and C5. A data mining tool, IBM Statistical Package for the Social Sciences Modeler 15.0 (IBM® SPSS® Modeler 15), was used to evaluate the performance of the models on six Y-STR data sets. Overall, the results showed that the decision tree models were able to classify all six Y-STR data sets significantly. Among the four models, C5 was the most consistent, producing the highest accuracy score of 91.85%, sensitivity score of 93.69% and specificity score of 96.32%.
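The three scores reported above can be illustrated with a small sketch. The abstract does not say how sensitivity and specificity were aggregated across classes, so the macro-averaging used here is an assumption, and the function name `macro_metrics` and the toy confusion matrix are hypothetical. The study itself used IBM SPSS Modeler, not this code.

```python
def macro_metrics(confusion):
    """Return (accuracy, macro sensitivity, macro specificity).

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Macro-averaging over classes is
    an assumption; the source does not state the averaging scheme.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    sensitivities, specificities = [], []
    for k in range(n):
        tp = confusion[k][k]                                # class k predicted as k
        fn = sum(confusion[k]) - tp                         # class k predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp                           # everything else
        sensitivities.append(tp / (tp + fn) if tp + fn else 0.0)
        specificities.append(tn / (tn + fp) if tn + fp else 0.0)
    accuracy = sum(confusion[k][k] for k in range(n)) / total
    return accuracy, sum(sensitivities) / n, sum(specificities) / n


if __name__ == "__main__":
    # Toy 3-class confusion matrix (made-up counts, not data from the paper).
    toy = [
        [8, 1, 1],
        [2, 7, 1],
        [0, 1, 9],
    ]
    acc, sens, spec = macro_metrics(toy)
    print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

In a multi-class setting such as haplogroup classification, each class is treated in turn as "positive" against all others, which is why specificity can exceed both accuracy and sensitivity: true negatives accumulate across the other classes.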

Zainab Abu Bakar | Ali Seman | Azizian Mohd Sapawi | Ida Rosmini Othman
