Feature selection in Bayesian classifiers for the prognosis of survival of cirrhotic patients treated with TIPS

The transjugular intrahepatic portosystemic shunt (TIPS) is a treatment for cirrhotic patients with portal hypertension. One subgroup of patients dies within the first 6 months after placement, while another survives for a long period. To date, no risk factors have been identified that determine how long a patient will survive. An empirical study on predicting survival within the first 6 months after TIPS placement is conducted using a clinical database with 107 cases and 77 variables. Applications of Bayesian classification models, based on Bayesian networks, to medical problems have become popular in recent years. Feature subset selection is useful because medical databases are heterogeneous and not all variables are required to perform the classification. In this paper, filter and wrapper approaches to feature subset selection are adapted to induce Bayesian classifiers (naive Bayes, selective naive Bayes, semi-naive Bayes, tree augmented naive Bayes, and k-dependence Bayesian classifier) and are applied to distinguish between the two subgroups of cirrhotic patients. The estimated accuracies obtained are consistent with the results of previous studies. Moreover, physicians greatly appreciate the medical significance of the subset of variables selected by the classifiers, along with the comprehensibility of Bayesian models.
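To make the wrapper idea concrete, the following is a minimal sketch of wrapper-style feature subset selection around a naive Bayes classifier, which is roughly what a selective naive Bayes does. It is an illustration under stated assumptions and not the procedure used in the study: the greedy forward search, the leave-one-out accuracy estimate, the Laplace smoothing, and the tiny synthetic dataset (in which only feature 0 carries class information) are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def nb_predict(train_X, train_y, x, features):
    """Predict the class of x with a discrete naive Bayes restricted to `features`.

    Conditional probabilities use Laplace (add-one) smoothing (an assumption).
    """
    n = len(train_y)
    class_counts = Counter(train_y)
    best_class, best_score = None, -math.inf
    for c in sorted(class_counts):
        rows = [r for r, label in zip(train_X, train_y) if label == c]
        score = math.log(class_counts[c] / n)  # log prior
        for f in features:
            n_values = len({r[f] for r in train_X})
            matches = sum(1 for r in rows if r[f] == x[f])
            score += math.log((matches + 1) / (len(rows) + n_values))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of naive Bayes using only `features`."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nb_predict(train_X, train_y, X[i], features) == y[i]
    return hits / len(X)


def forward_wrapper(X, y):
    """Greedy forward selection: add the feature that most improves the
    estimated accuracy, and stop when no feature improves it."""
    selected = []
    best_acc = loo_accuracy(X, y, selected)  # baseline: class prior only
    remaining = set(range(len(X[0])))
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in sorted(remaining)]
        acc, f = max(scored, key=lambda t: t[0])  # lowest index wins ties
        if acc <= best_acc:
            break
        selected.append(f)
        remaining.remove(f)
        best_acc = acc
    return selected, best_acc


# Synthetic toy data (not the clinical database): feature 0 equals the class,
# features 1 and 2 are uninformative.
X = [(0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 1), (0, 1, 0),
     (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 0), (1, 0, 0), (1, 1, 1)]
y = [0] * 6 + [1] * 6

selected, best_acc = forward_wrapper(X, y)
print(selected, best_acc)
```

A filter approach would instead rank the variables by a classifier-independent score (for example, mutual information with the class) and keep the top-ranked ones, without running the classifier inside the search loop.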
