An application of a genetic algorithm in conjunction with other data mining methods for estimating outcome after hospitalization in cancer patients.

BACKGROUND We investigated which factors predict the risk of in-hospital mortality in a general population of cancer patients with non-terminal disease, and whether the genetic algorithm technique is useful for this purpose. MATERIAL/METHODS A total of 201 cancer patients were retrospectively evaluated: all cases of in-hospital mortality over a 2-year period and a control group of patients discharged during the same period. All had an Eastern Cooperative Oncology Group (ECOG) performance status of ≤3 at the time of admission. Indicators of in-hospital mortality were determined by multivariate logistic regression, recursive partitioning analysis, neural network, and genetic algorithm (GA) techniques. The performance of the different techniques was compared by a number of measures, including receiver operating characteristic (ROC) curve analysis. RESULTS Across the four analysis methods, a combination of six explanatory variables was selected to explain the risk of in-hospital mortality: lactate dehydrogenase (LDH), alanine transaminase (ALT), hemoglobin (Hb), white blood cell count (WBC), type of cancer, and reason for admission. Compared with the other three methods, GA selected the fewest explanatory variables (LDH and reason for admission), explained a similar fraction of cases (78.6%), and yielded a fitness score of 0.52. CONCLUSIONS LDH is an important indicator of in-hospital mortality in hospitalized cancer patients who are not in the terminal stage. GA reliably predicted in-hospital mortality and was as efficient as the other data mining techniques employed in this study. Its use in a clinical setting for prognostication in oncology appears promising.
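
For readers unfamiliar with GA-based variable selection, the following is a minimal sketch of the general idea and not the implementation used in this study. Each candidate solution is a bit mask over six explanatory variables, and a fitness function rewards predictive accuracy while penalizing the number of variables kept. Everything specific here is an assumption made for illustration: the synthetic data, the nearest-centroid classifier standing in for the predictive model, the penalty of 0.02 per variable, the GA settings, and the function names `make_data`, `accuracy`, `fitness`, and `run_ga`. The fitness defined below is not the fitness score reported above.

```python
import random

def make_data(n=200, n_features=6, informative=(0, 5), seed=0):
    """Synthetic binary-outcome data; only the `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        for j in informative:
            row[j] += 1.5 * label  # shift informative features for positive cases
        X.append(row)
        y.append(label)
    return X, y

def accuracy(mask, X, y):
    """Training-set accuracy of a nearest-centroid rule on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty variable set cannot predict anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, lab in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == lab
    return correct / len(y)

def fitness(mask, X, y, penalty=0.02):
    """Hypothetical fitness: accuracy minus a small cost per selected variable."""
    return accuracy(mask, X, y) - penalty * sum(mask)

def run_ga(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=1):
    """Simple GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}  # there are only 2**n masks, so memoize fitness evaluations

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, X, y)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = [pop[0][:], pop[1][:]]  # elitism: carry over the two best masks
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament of size 3
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randint(1, n - 1)              # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    return best, score(best)

X, y = make_data()
best_mask, best_score = run_ga(X, y)
print("selected variables:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_score, 3))
```

The per-variable penalty is what pushes the search toward small variable sets, which is one plausible reading of why a GA might retain fewer variables than the other methods. In a real analysis the accuracy would be estimated on held-out data rather than on the training set.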