Prognostic comparison of statistical, neural and fuzzy methods of analysis of breast cancer image cytometric data

This study aims to predict a breast cancer patient's prognosis and to determine the most important prognostic factors using three methods: logistic regression (LR) as a conventional statistical method, a multilayer backpropagation neural network (MLBPNN) as a neural network method, and the fuzzy K-nearest neighbour algorithm (FK-NN) as a fuzzy logic method, together with a fuzzy measurement based on the FK-NN and the leave-one-out error method. The data used for breast cancer prognostic prediction were collected from 100 women who were clinically diagnosed with breast disease in the form of carcinoma or benign conditions. The data set consists of 7 image cytometric prognostic factors and 2 corresponding outputs to be predicted: whether the patient is alive or dead within 5 years of diagnosis. LR selected a 5-factor subset with a prognostic predictive accuracy of 82%, while the highest predictive accuracy of the MLBPNN was 87%, obtained from two subsets. The FK-NN yielded the highest predictive accuracy of the three methods, 88%, achieved by eight different subsets, of which the subset with the highest fuzzy measurement was {tumour histology, DNA ploidy, S-phase fraction (SPF), G0G1/G2M ratio}. Although the three methods resulted in different models, the results suggest that tumour histology, DNA ploidy and SPF, which appear in the models from all three methods, may be the most significant factors for achieving accurate and reliable breast cancer prognostic prediction.
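To make the FK-NN and leave-one-out combination concrete, the following is a minimal sketch of a fuzzy K-nearest neighbour classifier in the style of Keller et al. (1985), evaluated with leave-one-out. It is not the authors' implementation: the function names, the crisp initial memberships, the choice of K = 3 and fuzzifier m = 2, and the toy two-feature data are all assumptions made for illustration, and the study's fuzzy measurement for ranking subsets is not reproduced here.

```python
import math

def fuzzy_knn_memberships(train_x, train_y, query, k=3, m=2.0, eps=1e-9):
    """Return {class: membership} for `query` (hypothetical helper).

    Assumes crisp training labels, so each neighbour contributes its
    inverse-distance weight entirely to its own class.
    """
    # Keep the k training samples closest to the query (Euclidean distance).
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # eps avoids division by zero when the query coincides with a sample.
    weights = [(1.0 / (d ** exponent + eps), y) for d, y in nearest]
    total = sum(w for w, _ in weights)
    classes = sorted(set(train_y))
    return {c: sum(w for w, y in weights if y == c) / total for c in classes}

def leave_one_out_accuracy(xs, ys, k=3, m=2.0):
    """Hold out each sample in turn, classify it from the rest, and
    return the fraction of held-out samples classified correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        u = fuzzy_knn_memberships(train_x, train_y, xs[i], k, m)
        predicted = max(u, key=u.get)  # class with the highest membership
        correct += predicted == ys[i]
    return correct / len(xs)

# Toy data (NOT the study's data): two well-separated clusters,
# labelled 0 and 1 to stand in for the two outcome classes.
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
      (2.0, 2.1), (2.2, 1.9), (1.9, 2.3), (2.1, 2.0)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

loo_accuracy = leave_one_out_accuracy(xs, ys, k=3)
memberships = fuzzy_knn_memberships(xs, ys, (0.1, 0.1), k=3)
```

Under these assumptions, comparing factor subsets would amount to calling `leave_one_out_accuracy` on the same samples restricted to each candidate subset of columns and comparing the resulting accuracies.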
