A predictive model for cerebrovascular disease using data mining

Cerebrovascular disease has ranked second or third among the top 10 causes of death in Taiwan and has caused about 13,000 deaths every year since 1986. Once cerebrovascular disease occurs, it leads not only to high medical care costs but even to death. Developed countries give high priority to the prevention and treatment of cerebrovascular disease and have invested considerable budget and human resources in long-term studies to reduce this heavy burden. Because the pathogenesis of cerebrovascular disease is complex and variable, it is hard to diagnose accurately in advance. From the perspective of preventive medicine, however, a predictive model is needed to improve diagnostic accuracy. Therefore, in conjunction with the 2007 cerebrovascular disease prevention and treatment program of a regional teaching hospital in Taiwan, this study applied classification techniques to construct an optimum predictive model for cerebrovascular disease. Classification rules were extracted from this model and used to improve the diagnosis and prediction of cerebrovascular disease. The study acquired 493 valid samples from the program and used three classification algorithms (decision tree, Bayesian classifier, and back-propagation neural network) to construct one classification model each. After the models were compared on two measures of classification performance, sensitivity and accuracy, the decision tree model was chosen as the optimum predictive model for cerebrovascular disease. This model achieved a sensitivity of 99.48% and an accuracy of 99.59%, and it yielded eight important factors for predicting cerebrovascular disease and 16 diagnostic classification rules. Five experienced cerebrovascular doctors assessed these rules and confirmed that they are useful in current clinical practice.
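
The two measures used to compare the models, sensitivity and accuracy, can be illustrated with a minimal Python sketch. It assumes the standard confusion-matrix definitions (sensitivity = TP / (TP + FN), accuracy = (TP + TN) / N), which the abstract does not spell out. The function name `sensitivity_and_accuracy` and the toy labels are hypothetical and are not taken from the study's data.

```python
from typing import Sequence, Tuple


def sensitivity_and_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, accuracy) for binary labels, where 1 = disease."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive cases")

    sensitivity = tp / (tp + fn)              # share of true cases detected
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all cases correct
    return sensitivity, accuracy


# Toy example (hypothetical labels, not the study's 493 samples).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, acc = sensitivity_and_accuracy(y_true, y_pred)
print(f"sensitivity={sens:.2%}, accuracy={acc:.2%}")
```

Sensitivity matters in a screening setting because a missed case (false negative) is usually costlier than a false alarm, which may be why the abstract reports it alongside overall accuracy.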
