Improving medical decision trees by combining relevant health-care criteria

Over the years, decision trees have been widely used both to represent and to conduct decision processes. They can be automatically induced from databases using supervised learning algorithms, which usually aim at minimizing the size of the tree. When decision trees are induced in a medical setting, the induction process should take into account the background knowledge that health-care professionals use to make decisions, so that the resulting trees are medically and clinically comprehensible and correct. Comprehensibility measures the medical coherence of the sequence of questions represented in the tree, and correctness rates how irrelevant the errors of the decision tree are from a medical or clinical point of view. Some algorithms partially solve these problems by pursuing alternative objectives, such as reducing the economic cost or improving the adherence of the decision process to medical standards. However, from a clinical point of view, none of these criteria is valid when considered alone, because real medical decisions are made by considering a combination of them, together with other health-care criteria, simultaneously. Moreover, this combination of criteria is not static and may vary when the decision tree is built for different purposes, such as screening, diagnosis, prognosis, or drug and therapy prescription. In this paper, a decision tree induction algorithm that uses combinations of health-care criteria is presented and used to generate decision trees for screening and diagnosis in four medical domains. The mechanisms to formalize and to combine these criteria are also presented. The results have been analyzed from both a statistical and a medical point of view, and they suggest that our algorithm obtains decision trees that physicians evaluated as more comprehensible and correct than the decision trees obtained by previous approaches, while keeping an equivalent accuracy.
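To make the idea of combining health-care criteria during induction more concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes that each attribute carries a score in [0, 1] for each criterion, that a purpose such as screening or diagnosis is expressed as a set of weights over those criteria, and that the weighted average of the scores simply scales the information gain of the attribute. The attribute names, criterion names, scores, weights, and the function `combined_score` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Classic information gain of splitting on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def combined_score(gain, criteria_scores, weights):
    """Assumed combination rule: gain scaled by a weighted average of
    the criterion scores. The paper's actual mechanism may differ."""
    total_weight = sum(weights.values())
    preference = sum(weights[c] * criteria_scores[c]
                     for c in weights) / total_weight
    return gain * preference


def best_attribute(rows, labels, attribute_criteria, weights):
    """Pick the attribute with the highest combined score."""
    scores = {
        attr: combined_score(information_gain(rows, labels, attr),
                             crit, weights)
        for attr, crit in attribute_criteria.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: both tests separate the classes equally well.
rows = [
    {"blood_test": "high", "mri": "lesion"},
    {"blood_test": "high", "mri": "lesion"},
    {"blood_test": "normal", "mri": "clear"},
    {"blood_test": "normal", "mri": "clear"},
]
labels = ["sick", "sick", "healthy", "healthy"]

# Hypothetical criterion scores per attribute (1.0 is most favourable).
attribute_criteria = {
    "blood_test": {"low_cost": 0.9, "adherence": 0.5},
    "mri": {"low_cost": 0.1, "adherence": 0.9},
}

# Hypothetical purpose-dependent weights over the criteria.
SCREENING = {"low_cost": 0.8, "adherence": 0.2}
DIAGNOSING = {"low_cost": 0.2, "adherence": 0.8}

screening_choice = best_attribute(rows, labels, attribute_criteria, SCREENING)
diagnosing_choice = best_attribute(rows, labels, attribute_criteria, DIAGNOSING)
```

In this toy setting both attributes have the same information gain, so the choice is driven entirely by the criteria: the screening weights favour the cheaper `blood_test`, while the diagnosing weights favour `mri`. This mirrors the abstract's point that the combination of criteria is not static and depends on the purpose of the tree.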
