Cross Validation Evaluation for Breast Cancer Prediction Using Multilayer Perceptron Neural Networks

Problem statement: The presence of metastasis in the regional lymph nodes is the most important factor in predicting prognosis in breast cancer. Many biomarkers have been identified that appear to relate to the aggressive behaviour of cancer. However, the nonlinear relation of these markers to nodal status, together with complex interactions between the markers, has hindered accurate prognosis.

Approach: The aim of this study is to investigate the effectiveness of a Multilayer Perceptron (MLP) for predicting breast cancer progression using a set of four biomarkers of breast tumors: DNA ploidy, cell cycle distribution (G0G1/G2M), steroid receptors (ER/PR) and S-Phase Fraction (SPF). A further objective is to explore the predictive potential of these markers in defining the state of nodal involvement in breast cancer. Two methods of outcome evaluation, stratified and simple k-fold Cross Validation (CV), are compared for their accuracy and reliability in neural network validation. Output accuracy, sensitivity and specificity are used as criteria both for selecting the better validation technique and for evaluating the network outcome for different combinations of markers.

Results: Stratified 2-fold CV is more accurate and reliable than simple k-fold CV: it obtains higher accuracy and specificity and provides a more stable network validation in terms of sensitivity. The best prediction results are obtained with a single marker, SPF, which achieves an accuracy of 65%.

Conclusion/Recommendations: Our findings suggest that MLP-based analysis provides an accurate and reliable platform for breast cancer prediction, provided that an appropriate design and validation method is employed.
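The difference between the two validation schemes, and the three criteria used to compare them, can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names (`simple_kfold`, `stratified_kfold`, `confusion_metrics`), the label encoding (1 for node-positive, 0 for node-negative) and the synthetic labels in the demo are all assumptions made for illustration, and the MLP itself is left out.

```python
import random
from collections import defaultdict


def simple_kfold(n_samples, k, seed=0):
    """Shuffle all indices and deal them into k folds, ignoring class labels."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def stratified_kfold(labels, k, seed=0):
    """Deal each class's indices into k folds separately, so every fold
    keeps roughly the same class ratio as the full data set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        members = by_class[y]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def confusion_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Synthetic, imbalanced labels (assumption): 30 positive, 70 negative.
    labels = [1] * 30 + [0] * 70
    for name, folds in [
        ("simple", simple_kfold(len(labels), 2)),
        ("stratified", stratified_kfold(labels, 2)),
    ]:
        positives = [sum(labels[i] for i in fold) for fold in folds]
        print(name, "positives per fold:", positives)
```

In a full evaluation, each fold would serve once as the test set while a classifier is trained on the remaining folds, and `confusion_metrics` would be applied to the held-out predictions. With simple k-fold the number of positive cases per fold varies with the shuffle, whereas the stratified split fixes it, which is one plausible reason for more stable sensitivity estimates.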
