An improved feature extraction approach based on Rough Sets for the medical diagnosis

This paper presents a novel rough-set-based approach for extracting complex features from a medical diagnosis corpus. Symptoms, the basic features in medical diagnosis, are usually correlated, and a combination of several basic symptoms may characterize a disease more precisely than any single symptom. However, too many features can reduce generalization ability, and unfit features act as noise that can degrade the model's performance. We propose first to apply rough set theory to mine complex features, even from a noisy or inconsistent corpus. Second, these complex features are added to a maximum entropy model, a support vector machine, or a similar model as a new kind of feature, so that the feature weights are assigned according to the performance of the whole model. Experiments on the liver-disorders repository show that our method improves precision by 3.51% for the maximum entropy model, 3.05% for the support vector machine model, 3.59% for the naive Bayes model, and 3.59% for the Bayes and Good-Turing model.
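
The abstract does not spell out the mining procedure, so the following Python code is only a minimal sketch of the general idea, under stated assumptions: symptoms are already discretized, a complex feature is a conjunction of attribute-value pairs, and a conjunction is kept when the rows it covers (its indiscernibility block) mostly share one diagnosis. The function names, the thresholds `min_support` and `min_confidence`, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_complex_features(rows, labels, max_size=2,
                          min_support=2, min_confidence=0.8):
    """Return conjunctions of (attribute index, value) pairs.

    Rows that agree on the chosen attributes form one block. A block is
    kept when it covers at least min_support rows and its majority label
    reaches min_confidence; a threshold below 1.0 tolerates inconsistent
    rows (an assumption of this sketch, not the paper's stated rule).
    """
    n_attrs = len(rows[0])
    features = []
    # Start at size 2: single attributes are already basic features.
    for size in range(2, max_size + 1):
        for attrs in combinations(range(n_attrs), size):
            blocks = {}
            for row, label in zip(rows, labels):
                key = tuple(row[a] for a in attrs)
                blocks.setdefault(key, Counter())[label] += 1
            for key, counts in blocks.items():
                support = sum(counts.values())
                _, hits = counts.most_common(1)[0]
                if support >= min_support and hits / support >= min_confidence:
                    features.append(tuple(zip(attrs, key)))
    return features


def extend_row(row, features):
    """Append one 0/1 indicator per mined conjunction to a row."""
    indicators = [int(all(row[a] == v for a, v in feat)) for feat in features]
    return list(row) + indicators


# Hypothetical toy decision table: (enzyme level, drinks alcohol).
toy_rows = [
    ("high", "yes"), ("high", "yes"), ("high", "no"),
    ("low", "yes"), ("low", "no"), ("low", "no"),
]
toy_labels = ["sick", "sick", "healthy", "healthy", "healthy", "sick"]

toy_features = mine_complex_features(toy_rows, toy_labels)
toy_extended = [extend_row(row, toy_features) for row in toy_rows]
```

The extended rows could then be passed to any downstream classifier, which assigns the weight of each mined indicator jointly with the basic features, in the spirit of the second step described above.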
