GMDH-based feature ranking and selection for improved classification of medical data

Medical applications are often characterized by a large number of disease markers and a relatively small number of data records. We demonstrate that complete feature ranking followed by selection can lead to appreciable reductions in data dimensionality, with significant improvements in the implementation and performance of classifiers for medical diagnosis. We describe a novel approach for ranking all features according to their predictive quality using properties unique to learning algorithms based on the group method of data handling (GMDH). An abductive network training algorithm is used repeatedly to select groups of optimum predictors from the feature set at gradually increasing levels of model complexity specified by the user. Groups selected earlier are better predictors. The process is then repeated to rank the features within each group. The resulting full feature ranking can be used to determine the optimum feature subset by starting at the top of the list and progressively including more features until the classification error rate on an out-of-sample evaluation set starts to increase due to overfitting. The approach is demonstrated on two medical diagnosis datasets (breast cancer and heart disease), and comparisons are made with other feature ranking and selection methods. Receiver operating characteristic (ROC) analysis is used to compare classifier performance. At default model complexity, dimensionality reductions of 22% and 54% could be achieved for the breast cancer and heart disease data, respectively, leading to improvements in the overall classification performance. For both datasets, considerable dimensionality reduction introduced no significant reduction in the area under the ROC curve. GMDH-based feature selection results have also proved effective with neural network classifiers.
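
The subset-selection step described in the abstract (walk down the ranked list and stop once the out-of-sample error starts to rise) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_top_ranked`, the `error_rate` callback, the feature names, and the error values are all hypothetical, and the rule of continuing on ties and stopping only on a strict increase is an assumption. The GMDH ranking itself is taken as given.

```python
from typing import Callable, List, Sequence, Tuple


def select_top_ranked(
    ranked_features: Sequence[str],
    error_rate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Grow a feature subset from the top of a ranking.

    ranked_features: features ordered best predictor first (assumed given,
        e.g. produced by some ranking procedure).
    error_rate: hypothetical callback that trains a classifier on the given
        features and returns its error rate on an out-of-sample evaluation set.
    Returns the chosen subset and its evaluation error.
    """
    if not ranked_features:
        raise ValueError("ranking must contain at least one feature")

    subset = [ranked_features[0]]
    best_error = error_rate(subset)

    for feature in ranked_features[1:]:
        candidate = subset + [feature]
        err = error_rate(candidate)
        # Assumption: stop at the first strict increase in evaluation error,
        # which is read here as the onset of overfitting.
        if err > best_error:
            break
        subset, best_error = candidate, err

    return subset, best_error


# Hypothetical ranking and made-up evaluation errors, indexed by subset size.
ranking = ["f3", "f1", "f4", "f2", "f5"]
made_up_errors = {1: 0.12, 2: 0.08, 3: 0.05, 4: 0.07, 5: 0.04}

chosen, chosen_error = select_top_ranked(
    ranking, lambda feats: made_up_errors[len(feats)]
)
# Fraction of the original features that were dropped.
reduction = 1 - len(chosen) / len(ranking)
print(chosen, chosen_error, reduction)
```

In this made-up example the error rises when the fourth feature is added, so the search stops with three features even though the full set would have scored lower. That is a property of the greedy stopping rule as sketched here; a variant that evaluates every prefix and keeps the global minimum would be an equally plausible reading.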
