In this paper we investigate the application of Self-Organizing Maps (SOM) to classification and rule extraction in data sets with missing values, using real clinical data of bladder cancer patients provided by Kitasato University Hospital. When input data with missing values are presented to a SOM, each missing value is usually either interpolated in the preprocessing stage or replaced with a specific value or property that marks it as missing. In either case, it may be possible to extract rules from the data with missing values. On the other hand, for data sets in which missing values should be ignored, the substituted values can degrade classification. We propose a method in which the SOM is trained using input vectors from which the properties corresponding to missing values are excluded, which reduces the influence of the missing values on the resulting map. Through computer simulation, we show that the proposed method gives good results in classification and rule extraction from the clinical data of bladder cancer patients.
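The idea of excluding missing properties from the input vector can be illustrated with a minimal sketch. This is not the implementation used in the experiments: it assumes that missing values are encoded as NaN, that features are scaled to [0, 1], and that both the best-matching-unit search and the weight update use only the observed components. The function names `train_masked_som` and `best_matching_unit`, the map size, and the linearly decaying learning rate and neighborhood width are hypothetical choices made for illustration.

```python
import numpy as np

def train_masked_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM, skipping NaN (missing) components of each input.

    Assumption: features are scaled to [0, 1] and missing values are NaN.
    """
    rng = np.random.default_rng(seed)
    n, d = data.shape
    weights = rng.random((rows * cols, d))  # one weight vector per map unit
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = data[i]
            mask = ~np.isnan(x)  # True where the property was observed
            frac = step / total
            step += 1
            if not mask.any():
                continue  # nothing observed, nothing to learn from
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # decaying neighborhood width
            # Distance and update use observed components only.
            diff = x[mask] - weights[:, mask]
            bmu = np.argmin((diff ** 2).sum(axis=1))
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights[:, mask] += lr * h[:, None] * diff
    return weights

def best_matching_unit(weights, x):
    """Index of the unit closest to x over the observed components of x."""
    mask = ~np.isnan(x)
    return int(np.argmin(((x[mask] - weights[:, mask]) ** 2).sum(axis=1)))

if __name__ == "__main__":
    # Toy records with three properties; NaN marks a missing value.
    toy = np.array([
        [0.1, 0.2, np.nan],
        [0.9, np.nan, 0.8],
        [0.2, 0.1, 0.3],
        [np.nan, 0.9, 0.7],
    ])
    som = train_masked_som(toy)
    print([best_matching_unit(som, row) for row in toy])
```

In this sketch a missing property neither contributes to the distance used to select the winning unit nor moves any weight, so no interpolated or placeholder value enters the map.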