A colon cancer grade prediction model using texture and statistical features, SMOTE and mRMR

Colon cancer is usually diagnosed by visual examination of histopathological tissue specimens under a microscope. Histopathologists not only distinguish healthy from malignant tissue, but also determine the quantitative grade of cancer in malignant subjects. This manual process is slow and laborious, and its outcome depends on the experience and workload of the histopathologist. Researchers are therefore developing automatic colon cancer grade prediction systems that can provide a supporting second opinion to histopathologists. This work presents a computer-aided diagnostic system that uses Haralick texture features and features based on statistical moments of the intensity histogram to predict colon cancer grades from biopsy images. The Synthetic Minority Oversampling Technique (SMOTE) is then applied to create a balanced dataset in which all classes are equally represented. Discriminative features are selected from the oversampled feature space using the minimum Redundancy Maximum Relevance (mRMR) feature selection method. A support vector machine (SVM) with a radial basis function (RBF) kernel classifies the samples into three grades of colon cancer. The texture and statistical features, coupled with data balancing and feature selection, enable the SVM classifier to yield good performance. The individual effects of SMOTE and mRMR on performance have also been analyzed in detail. The proposed cancer grade prediction system outperforms existing techniques on a colon cancer grading dataset in terms of performance measures such as accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curves, and the kappa statistic.
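The oversampling step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the neighbour count `k`, the seed, and the two-dimensional feature vectors are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment joining base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature vectors for an under-represented grade.
minority_grade = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.95], [0.30, 0.70]]
majority_count = 10  # assumed size of the largest class
new_samples = smote(minority_grade, majority_count - len(minority_grade))
balanced_minority = minority_grade + new_samples
print(len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region already occupied by that class. In the pipeline described above, feature selection and SVM training would then operate on the balanced feature set.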
