Breast cancer is a complex and heterogeneous disease, with diverse morphological features and differing clinical outcomes. As a result, breast cancer patients may respond differently to the available therapeutic options. Currently, difficulty in recognizing the type of breast cancer leads to inefficient treatment. Broadly, breast lesions are classified as either malignant or benign. It is therefore necessary to devise a clinically meaningful classification of the disease that can accurately assign breast cancer tissues to the relevant classes. This study aims to classify breast cancer lesions obtained by the fine needle aspiration (FNA) procedure using random forest. Random forest is a classifier built from a combination of decision trees and has been reported to perform well in comparison with other machine learning techniques. The method was tested on approximately 700 instances, comprising 458 benign cases and 241 malignant cases. The performance of the proposed method is measured in terms of sensitivity, specificity and accuracy. The experimental results show that random forest achieved a sensitivity of 75%, a specificity of 70% and an accuracy of about 72%. Thus, it can be concluded that random forest can accurately classify breast cancer types given a small number of features, and that it is a promising tool for differentiating malignant from benign tumors at an early stage.
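The three performance measures named above can be computed from the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' implementation: it assumes malignant is the positive class (label 1) and benign the negative class (label 0), and the function name `binary_metrics` and the toy labels are hypothetical, chosen only for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy for binary labels.

    Assumption (not stated in the source): 1 = malignant (positive),
    0 = benign (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, predicted malignant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, predicted benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, predicted malignant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, predicted benign

    nan = float("nan")
    return {
        # Fraction of malignant cases that are correctly detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of benign cases that are correctly detected.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Fraction of all cases that are correctly classified.
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


# Hypothetical toy labels, not data from the study.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

In this toy example, 3 of 4 malignant cases and 4 of 6 benign cases are predicted correctly, so the sensitivity is 0.75, the specificity is about 0.67 and the accuracy is 0.70. In practice, `y_pred` would come from a trained classifier such as a random forest evaluated on held-out instances.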