R-C4.5 decision tree model and its applications to health care dataset

This paper introduces R-C4.5, a robust and practical improved decision tree model, together with a simplified version of it. The model is based on C4.5 and improves its attribute selection and partitioning methods. R-C4.5 avoids fragmentation by merging branches that classify poorly. The simplified version of R-C4.5 is implemented in data preprocessing. Experiments show that R-C4.5 and its simplified version make the selection of splitting attributes more interpretable, reduce the number of insignificant or empty branches, and avoid overfitting. The paper focuses on applying R-C4.5 to health care research, specifically to predicting inpatient length of stay. The results are easier for managers to understand and accept, and they can help health care organizations plan and make full use of hospital resources.
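The branch-merging idea can be illustrated with a minimal Python sketch. The abstract does not state the criterion R-C4.5 uses to decide that a branch classifies poorly, so the entropy threshold `max_entropy`, the function names, and the toy ward data below are all assumptions made for illustration; only the gain ratio is the standard C4.5 measure.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition(rows, labels, attr):
    """Group class labels by the value each row takes on `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    return dict(groups)


def merge_poor_branches(groups, max_entropy=0.8):
    """Pool every branch whose entropy exceeds `max_entropy` into one branch.

    ASSUMPTION: an entropy threshold stands in for the paper's own
    (unstated here) test of a "poor classified effect".
    """
    merged = {}
    pooled_values, pooled_labels = [], []
    for value, branch_labels in groups.items():
        if entropy(branch_labels) > max_entropy:
            pooled_values.append(value)
            pooled_labels.extend(branch_labels)
        else:
            merged[(value,)] = branch_labels
    if pooled_labels:
        merged[tuple(pooled_values)] = pooled_labels
    return merged


def gain_ratio(labels, groups):
    """Standard C4.5 gain ratio of a split described by `groups`."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / n * log2(len(g) / n) for g in groups.values())
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: ward versus length-of-stay class.
rows = [{"ward": w} for w in "AAABBCCDD"]
labels = ["short", "short", "short", "long", "long",
          "short", "long", "long", "short"]

groups = partition(rows, labels, "ward")
merged = merge_poor_branches(groups)
ratio = gain_ratio(labels, merged)
print(sorted(merged), round(ratio, 3))
```

In this toy case wards C and D each split evenly between the two classes, so they are pooled into a single branch, and the four-way split becomes a three-way split with fewer uninformative branches.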