The application of a decision tree to establish the parameters associated with hypertension

INTRODUCTION Hypertension is an important risk factor for cardiovascular disease (CVD). The goal of this study was to establish the factors associated with hypertension using a decision-tree algorithm, a supervised classification method from data mining.

METHODS Data from a cross-sectional study were used. A total of 9078 subjects who met the inclusion criteria were recruited. Approximately 70% of these subjects (6358 cases) were randomly allocated to the training dataset used to construct the decision tree. The remaining 30% (2720 cases) formed the testing dataset used to evaluate the performance of the decision tree. Two models were evaluated. In model I, the input variables were age, gender, body mass index, marital status, level of education, occupation status, depression and anxiety status, physical activity level, smoking status, LDL, TG, TC, FBG, uric acid and hs-CRP. In model II, the input variables were age, gender, WBC, RBC, HGB, HCT, MCV, MCH, PLT, RDW and PDW. The models were validated by constructing a receiver operating characteristic (ROC) curve.

RESULTS The prevalence of hypertension in our population was 32%. For decision-tree model I, the accuracy, sensitivity, specificity and area under the ROC curve (AUC) for identifying hypertension were 73%, 63%, 77% and 0.72, respectively. The corresponding values for model II were 70%, 61%, 74% and 0.68, respectively.

CONCLUSION We have developed a decision-tree model to identify the risk factors associated with hypertension that may be used to develop programs for hypertension management.
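The evaluation procedure described in the methods (a random 70/30 split, then accuracy, sensitivity, specificity and AUC on the held-out set) can be illustrated with a minimal sketch. It does not build the decision tree itself and is not the authors' code: the function names `split_70_30`, `confusion_metrics` and `roc_auc` are hypothetical, the 0.5 cut-off is an assumption, and the toy labels and scores are made up for illustration rather than taken from the study.

```python
import random

def split_70_30(records, seed=0):
    """Randomly allocate about 70% of records to training, the rest to testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = hypertensive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative toy data only (not study data): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
train, test = split_70_30(range(10))
```

In practice the scores would be the predicted probabilities from the fitted tree on the testing dataset, and the split would be applied to the full set of subjects before training.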
