A decision support system based on an ensemble of random forests for improving the management of women with abnormal findings at cervical cancer screening

In most cases, cervical cancer (CxCa) develops because abnormalities in the Pap test are underestimated. Ancillary molecular biology techniques are now available that provide important information on CxCa and the natural history of the Human Papillomavirus (HPV), including HPV DNA tests, HPV mRNA tests and immunocytochemistry techniques such as the detection of p16 overexpression. Each of these techniques is either highly sensitive or highly specific, but not both, so no perfect method is available today. In this paper we present a decision support system (DSS) based on an ensemble of Random Forests (RFs) that intelligently combines the results of the classic and ancillary techniques available for CxCa detection, in order to exploit the benefits of each technique and produce more accurate results. The proposed system achieved both high sensitivity (86.1%) and high specificity (93.3%), as well as high overall accuracy (91.8%), in detecting cervical intraepithelial neoplasia grade 2 or worse (CIN2+). Its performance was better than that of any single test involved in this study. Moreover, the proposed architecture, an ensemble of RFs, outperformed the single-classifier approach. The system can handle cases with missing tests and, more importantly, cases with an inadequate cytological outcome; it can therefore also produce accurate results in stand-alone HPV-based screening, where the Pap test is not applied. The proposed system may identify women at true risk of developing CxCa and guide personalised management and therapeutic interventions.
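
The three reported figures (sensitivity, specificity and overall accuracy for CIN2+ detection) follow the standard definitions based on a binary confusion matrix. The sketch below is only an illustration of those definitions: the function name `screening_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) for a binary test.

    tp/fn: CIN2+ cases flagged / missed by the test.
    tn/fp: cases below CIN2 correctly cleared / wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)   # share of CIN2+ cases detected
    specificity = tn / (tn + fp)   # share of <CIN2 cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only for illustration (not the study's data).
sens, spec, acc = screening_metrics(tp=40, fn=10, tn=190, fp=10)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Because accuracy pools both classes, it depends on the proportion of CIN2+ cases in the evaluated population, which is why sensitivity and specificity are reported alongside it.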
