Using Classification Tree and Logistic Regression Methods to Diagnose Myocardial Infarction

Early and accurate diagnosis of myocardial infarction (MI) in patients who present to the Emergency Room (ER) complaining of chest pain is an important problem in emergency medicine. A number of decision aids have been developed to assist with this problem but have not achieved general use. Machine learning techniques, including classification tree and logistic regression (LR) methods, have the potential to create simple but accurate decision aids. Both a classification tree (FT Tree) and an LR model (FT LR) have been developed to predict the probability that a patient with chest pain is having an MI based solely upon data available at the time of presentation to the ER. Training data came from a data set collected in Edinburgh, Scotland. Each model was then tested on a separate Edinburgh data set, as well as on a data set from a different hospital in Sheffield, England. Two previously published models, the Goldman classification tree[1] and the Kennedy LR equation[2], were evaluated on the same test data sets. On the Edinburgh test set, the FT Tree, FT LR, and Kennedy LR performed equally well, with ROC curve areas of 94.04%, 94.28%, and 94.30%, respectively, while the Goldman Tree performed significantly worse, with an area of 84.03%. The difference in ROC area between each of the first three models and the Goldman Tree is significant beyond the 0.0001 level. On the Sheffield test set, the ROC areas of the FT Tree, FT LR, and Kennedy LR were not significantly different (p ≥ 0.17), while the FT Tree again outperformed the Goldman Tree (p = 0.006). Unlike previous work[3], this study indicates that classification trees, which have certain advantages over LR models, may perform as well as LR models in the diagnosis of patients with MI.
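The ROC curve area used to compare the models can be read as the probability that a randomly chosen MI patient receives a higher predicted score than a randomly chosen non-MI patient, with ties counted as one half [4]. The following is a minimal sketch of that computation, not the procedure used in the study; the function name `roc_area` and the toy scores and labels are hypothetical and do not come from the Edinburgh or Sheffield data.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases.

    scores: predicted probability of MI for each patient (hypothetical
            output of a tree or LR model)
    labels: 1 for a confirmed MI, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one MI and one non-MI case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # MI case ranked above non-MI case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up illustrative values, not data from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1]
toy_area = roc_area(toy_scores, toy_labels)
```

Because the area depends only on how the scores rank the patients, it can be applied in the same way to the leaf probabilities of a classification tree and to the output of an LR equation. A classification tree produces only as many distinct scores as it has leaves, so the tie handling matters more for trees than for LR models.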

[1]  K. Liestøl, et al. Prospective evaluation of an EDB-based diagnostic program to be used in patients admitted to hospital with acute chest pain. European Heart Journal, 1993.

[2]  R. Harrison, et al. Early diagnosis of acute myocardial infarction using clinical and electrocardiographic data at presentation: derivation and evaluation of logistic regression models. European Heart Journal, 1996.

[3]  R. B. D'Agostino, et al. A comparison of logistic regression to decision-tree induction in a medical domain. Computers and Biomedical Research, an International Journal, 1993.

[4]  J. Hanley, et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 1982.

[5]  Aiko M. Hormann, et al. Programs for Machine Learning. Part I. Inf. Control., 1962.

[6]  R. D'Agostino, et al. A comparison of performance of mathematical predictive methods for medical diagnosis: identifying acute cardiac ischemia among emergency department patients. Journal of Investigative Medicine: the official publication of the American Federation for Clinical Research, 1995.

[7]  Jeffrey A. Stem, et al. A computer-derived protocol to aid in the diagnosis of emergency room patients with acute chest pain. The New England Journal of Medicine, 1982.

[8]  J. Ross Quinlan, et al. C4.5: Programs for Machine Learning. 1992.

[9]  W. Baxt. Use of an artificial neural network for the diagnosis of myocardial infarction. Annals of Internal Medicine, 1991.