A Hierarchical Feature-Based Methodology to Perform Cervical Cancer Classification

Pap smear image analysis can help prevent cervical cancer. The test screens for pre-neoplastic changes in cervical epithelial cells, and accurate screening can reduce deaths caused by the disease. Analyzing Pap smears is laborious, repetitive work that a cytopathologist performs visually. This article proposes an algorithm for cervical cancer detection that reduces this workload by analyzing cell nuclei features in Pap smear images. We propose a hierarchical classification methodology for computer-aided screening of cell lesions, which can recommend fields of view from the microscopy image based on the detection of cervical cell nuclei, and we investigate eight traditional machine learning methods to perform the hierarchical classification. We evaluate the algorithms on the Herlev and CRIC databases, varying the number of classes used during image classification. Results indicate that the hierarchical classification performed best with Random Forest as the key classifier, particularly when compared with decision trees, k-NN, and Ridge methods.
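
To make the idea of hierarchical classification concrete, the sketch below shows one possible two-level arrangement: a first classifier separates normal from abnormal nuclei, and a second classifier, trained only on abnormal samples, assigns a lesion grade. This is an illustration under stated assumptions, not the authors' pipeline. The class names (`normal`, `low_grade`, `high_grade`), the two toy features, and the helper classes `NearestCentroid` and `TwoLevelClassifier` are all hypothetical, and a dependency-free nearest-centroid rule stands in for the Random Forest that the abstract reports as the best-performing classifier.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Stand-in classifier (not Random Forest): predicts the closest class mean."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for features, label in zip(X, y):
            groups[label].append(features)
        # Mean feature vector per class.
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, features):
        return min(self.centroids_,
                   key=lambda label: dist(features, self.centroids_[label]))


class TwoLevelClassifier:
    """Hypothetical hierarchy: level 1 = normal vs abnormal, level 2 = lesion grade."""

    def __init__(self, make_classifier=NearestCentroid, normal_label="normal"):
        self.make_classifier = make_classifier
        self.normal_label = normal_label

    def fit(self, X, y):
        # Level 1: collapse every lesion grade into a single "abnormal" class.
        coarse = [lab if lab == self.normal_label else "abnormal" for lab in y]
        self.level1_ = self.make_classifier().fit(X, coarse)
        # Level 2: train only on abnormal samples, keeping their original labels.
        abnormal = [(f, lab) for f, lab in zip(X, y) if lab != self.normal_label]
        self.level2_ = self.make_classifier().fit(
            [f for f, _ in abnormal], [lab for _, lab in abnormal]
        )
        return self

    def predict(self, X):
        results = []
        for features in X:
            if self.level1_.predict_one(features) == self.normal_label:
                results.append(self.normal_label)
            else:
                # Only samples flagged as abnormal reach the second level.
                results.append(self.level2_.predict_one(features))
        return results


# Toy nucleus features (illustrative values only, not taken from Herlev or CRIC).
X_train = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.2, 4.8), (9.0, 9.0), (9.1, 8.8)]
y_train = ["normal", "normal", "low_grade", "low_grade", "high_grade", "high_grade"]

model = TwoLevelClassifier().fit(X_train, y_train)
predictions = model.predict([(1.1, 1.0), (5.1, 5.0), (8.9, 9.2)])
print(predictions)
```

Because the base classifier is passed in through `make_classifier`, any object exposing the same `fit` and `predict_one` methods could be substituted at either level, which is how different learning methods could be compared within the same hierarchy.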
