Feature subset selection by genetic algorithms and estimation of distribution algorithms - A case study in the survival of cirrhotic patients treated with TIPS

The transjugular intrahepatic portosystemic shunt (TIPS) is an interventional treatment for cirrhotic patients with portal hypertension. In the experience of our medical staff, the consequences of TIPS are not homogeneous across patients, and a subgroup dies within the first 6 months after TIPS placement. Currently, there is no risk indicator that identifies this subgroup of patients before treatment. We investigate the prediction of survival in cirrhotic patients treated with TIPS using a clinical database with 107 cases and 77 attributes. Four supervised machine learning classifiers are applied to discriminate between the two subgroups of patients. Applying several feature subset selection (FSS) techniques significantly improved the predictive accuracy of these classifiers and considerably reduced the number of attributes in the classification models. Among the FSS techniques, FSS-TREE, a new randomized algorithm inspired by the new EDA (estimation of distribution algorithm) paradigm, obtained the best average accuracy for each classifier.
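
To make the idea of EDA-based wrapper feature selection concrete, the following is a minimal sketch and not the FSS-TREE algorithm itself. It assumes a much simpler univariate probability model (in the style of UMDA) in place of the dependence-tree model that FSS-TREE is named after, a nearest-centroid classifier that is not one of the four classifiers used in the study, and synthetic data rather than the clinical database. The names make_data, loo_accuracy and umda_fss are hypothetical and chosen only for illustration.

```python
import random


def make_data(n_cases=40, n_features=12, n_relevant=3, seed=0):
    """Synthetic two-class data; only the first n_relevant features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cases):
        label = i % 2
        row = [rng.gauss(1.5 * label if j < n_relevant else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    hits = 0
    for i in range(len(X)):
        # Class centroids are computed without the held-out case i.
        sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
        counts = {0: 0, 1: 0}
        for k in range(len(X)):
            if k == i:
                continue
            counts[y[k]] += 1
            for c, j in enumerate(cols):
                sums[y[k]][c] += X[k][j]
        best_label, best_dist = None, float("inf")
        for label in (0, 1):
            dist = sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))
            if dist < best_dist:
                best_label, best_dist = label, dist
        hits += int(best_label == y[i])
    return hits / len(X)


def umda_fss(X, y, pop_size=20, generations=8, seed=1):
    """Univariate EDA over feature bitmasks (an assumed stand-in for FSS-TREE)."""
    rng = random.Random(seed)
    n = len(X[0])
    probs = [0.5] * n  # initial probability of including each feature
    cache = {}

    def score(mask):
        if mask not in cache:
            cache[mask] = loo_accuracy(X, y, mask)
        return cache[mask]

    best_mask, best_score = None, -1.0
    for _ in range(generations):
        # Sample a population of feature subsets from the current model.
        population = [tuple(int(rng.random() < p) for p in probs)
                      for _ in range(pop_size)]
        ranked = sorted(population, key=score, reverse=True)
        if score(ranked[0]) > best_score:
            best_mask, best_score = ranked[0], score(ranked[0])
        # Keep the better half and re-estimate the marginals from it,
        # with Laplace smoothing so no probability collapses to 0 or 1.
        selected = ranked[:pop_size // 2]
        probs = [(sum(m[j] for m in selected) + 1) / (len(selected) + 2)
                 for j in range(n)]
    return best_mask, best_score


if __name__ == "__main__":
    X, y = make_data()
    mask, acc = umda_fss(X, y)
    print("selected features:", [j for j, bit in enumerate(mask) if bit])
    print("leave-one-out accuracy:", acc)
```

Each candidate subset is scored by the classifier's own estimated accuracy, which is what makes this a wrapper approach. Replacing the independent per-feature probabilities with a model that captures pairwise dependencies between features would move the sketch closer to the tree-based idea named in the abstract.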
