Web-Based Framework for Breast Cancer Classification

Abstract. This work aims to create a web-based system that assists its users in cancer diagnosis by automatically classifying cytological images obtained during fine needle aspiration biopsy. The paper describes a study of the quality of various algorithms for segmenting these images and classifying breast cancer malignancy. The goal is to assign each breast cancer case, based on its fine needle aspiration biopsy images, to one of two malignancy classes: high or intermediate. To that end, we compared three segmentation methods (k-means, fuzzy c-means, and watershed) and, based on the resulting segmentations, constructed a 25-element feature vector. This feature vector was used as input to eight classifiers, and their accuracy was evaluated. The highest classification accuracy, 89.02%, was obtained with the multilayer perceptron. Fuzzy c-means proved to be the most accurate segmentation algorithm, but it is also the most computationally intensive of the three segmentation methods studied.
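To illustrate the fuzzy c-means segmentation step named above, the following is a minimal sketch of the standard algorithm applied to one-dimensional grayscale intensities. It is not the implementation used in this study: the function names `fuzzy_c_means` and `hard_labels`, the fuzzifier `m = 2`, the evenly spaced initial centers, and the fixed iteration count are all assumptions made for illustration.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    """Cluster 1-D intensities with fuzzy c-means; return (centers, memberships)."""
    lo, hi = min(values), max(values)
    # Assumed deterministic initialisation: centers spread evenly over the range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
            row = []
            for d_k in dists:
                denom = sum((d_k / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                row.append(1.0 / denom)
            memberships.append(row)
        # Center update: mean of the data weighted by u_ik^m.
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))
        centers = new_centers
    return centers, memberships


def hard_labels(memberships):
    """Assign each pixel to the cluster in which its membership is highest."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


# Toy example: three dark pixels (e.g. nuclei) and three bright pixels (background).
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(pixels)
labels = hard_labels(memberships)
```

Every pixel's membership is recomputed against every cluster center on each iteration, whereas k-means only assigns each pixel to its nearest center. This is one reason fuzzy c-means tends to cost more per iteration, consistent with the computational cost reported above.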
