A Comparative Study of Medical Data Classification Methods Based on Decision Tree and System Reconstruction Analysis

This paper compares methods for classifying medical data, applying decision trees and system reconstruction analysis to heart disease data mining. The data set was collected from patients with coronary heart disease and contains 1,723 records with 71 attributes each. We weight the data using system reconstruction analysis and then build decision trees with several algorithms: Iterative Dichotomiser 3 (ID3), C4.5, classification and regression tree (CART), chi-squared automatic interaction detector (CHAID), and exhaustive CHAID. We compare the algorithms by classification accuracy, number of leaves, and tree depth. The experiments show that weighting the data improves classification accuracy on the coronary heart disease data but has little effect on tree depth or number of leaves.
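The three comparison measures (classification accuracy, number of leaves, tree depth) can be illustrated with a minimal ID3-style sketch in Python. This is not the implementation used in the study: the toy records, the attribute names (`chest_pain`, `smoker`, `chd`), and the weight values are hypothetical, and the system reconstruction analysis that produces the weights is not reproduced. The sketch also assumes that weights attach to individual records, which the abstract does not state.

```python
import math
from collections import defaultdict

def class_totals(rows, weights, target):
    """Sum record weights per class label."""
    totals = defaultdict(float)
    for row, w in zip(rows, weights):
        totals[row[target]] += w
    return totals

def entropy(rows, weights, target):
    """Weighted Shannon entropy of the class label (weights assumed positive)."""
    totals = class_totals(rows, weights, target)
    total = sum(totals.values())
    return -sum((v / total) * math.log2(v / total) for v in totals.values())

def majority(rows, weights, target):
    """Label with the largest total weight; ties go to the alphabetically first label."""
    totals = class_totals(rows, weights, target)
    return max(sorted(totals), key=totals.get)

def split(rows, weights, attr):
    """Group rows and weights by the value of one categorical attribute."""
    groups = defaultdict(lambda: ([], []))
    for row, w in zip(rows, weights):
        groups[row[attr]][0].append(row)
        groups[row[attr]][1].append(w)
    return groups

def build(rows, weights, attrs, target):
    """ID3-style tree. A leaf is a label; a node is (attr, {value: subtree}, default)."""
    if entropy(rows, weights, target) == 0 or not attrs:
        return majority(rows, weights, target)
    total = sum(weights)

    def gain(attr):
        remainder = sum(sum(ws) / total * entropy(rs, ws, target)
                        for rs, ws in split(rows, weights, attr).values())
        return entropy(rows, weights, target) - remainder

    best = max(attrs, key=gain)
    rest = [a for a in attrs if a != best]
    children = {value: build(rs, ws, rest, target)
                for value, (rs, ws) in split(rows, weights, best).items()}
    return (best, children, majority(rows, weights, target))

def predict(tree, row):
    while isinstance(tree, tuple):
        attr, children, default = tree
        tree = children.get(row[attr], default)
    return tree

def leaf_count(tree):
    if not isinstance(tree, tuple):
        return 1
    return sum(leaf_count(child) for child in tree[1].values())

def depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1].values())

def accuracy(tree, rows, target):
    """Fraction of rows whose predicted label matches the recorded label."""
    return sum(predict(tree, r) == r[target] for r in rows) / len(rows)

# Hypothetical toy records, invented for illustration only.
rows = [
    {"chest_pain": "yes", "smoker": "yes", "chd": "pos"},
    {"chest_pain": "yes", "smoker": "no",  "chd": "pos"},
    {"chest_pain": "no",  "smoker": "yes", "chd": "pos"},
    {"chest_pain": "no",  "smoker": "no",  "chd": "neg"},
    {"chest_pain": "no",  "smoker": "no",  "chd": "neg"},
    {"chest_pain": "yes", "smoker": "no",  "chd": "neg"},
]
attrs = ["chest_pain", "smoker"]
uniform = [1.0] * len(rows)
assumed = [1.0, 3.0, 1.0, 1.0, 1.0, 1.0]  # hypothetical per-record weights

plain = build(rows, uniform, attrs, "chd")
weighted = build(rows, assumed, attrs, "chd")
for name, tree in (("unweighted", plain), ("weighted", weighted)):
    print(name, accuracy(tree, rows, "chd"), leaf_count(tree), depth(tree))
```

The same three functions (`accuracy`, `leaf_count`, `depth`) could be applied to trees built by any of the algorithms compared in the paper. The toy output says nothing about the study's results; it only shows how the measures are computed.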
