Development of a supervised machine learning model to predict recurrence of oral tongue squamous cell carcinoma

Objective: Despite diagnostic advancements, developing reliable prognostic systems for assessing the risk of cancer recurrence remains a challenge. In this study, we developed a novel framework that leverages the expansive Surveillance, Epidemiology, and End Results (SEER) database to generate highly representative machine learning prediction models for oral tongue squamous cell carcinoma (OTSCC) recurrence. Materials and Methods: Using our framework, we identified cases of 5- and 10-year OTSCC recurrence from the SEER database. Cases were split into training (80%) and test (20%) sets for model development and testing. Four classification models were trained using the H2O artificial intelligence platform, and their performance was assessed by accuracy, recall, precision, and the area under the receiver operating characteristic (ROC) curve (AUC). Feature importance was studied by evaluating Shapley additive explanations contribution plots. Results: Of 130,979 patients, 36,042 (27.5%) were female and the mean (SD) age was 58.2 (13.7) years. The Gradient Boosting Machine model performed best, achieving 81.8% accuracy, 0.75 AUC, 83.0% recall, and 97.7% precision for 5-year prediction. Moreover, 10-year predictions demonstrated 80.0% accuracy and 94.0% precision. The number of prior tumors, patient age, site of cancer recurrence, and tumor histology were the most significant predictors. Conclusion: Implementation of our novel SEER framework enabled successful identification of patients with OTSCC recurrence, from which highly accurate and sensitive prediction models were generated. Thus, we demonstrated our framework’s potential to be applied to various cancers for building generalizable screening tools to predict tumor recurrence.
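The four evaluation metrics named in the Materials and Methods can be illustrated with a minimal sketch. This is not the authors' H2O pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions chosen for illustration, with label 1 standing for recurrence. AUC is computed here through its rank interpretation (the probability that a randomly chosen recurrence case receives a higher score than a randomly chosen non-recurrence case, with ties counted as one half).

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, and precision for binary labels (1 = recurrence)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn),      # share of true recurrences that were flagged
        "precision": tp / (tp + fp),   # share of flagged cases that truly recurred
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy test set: true labels and model scores (not SEER data).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Accuracy, recall, and precision depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is one reason a model can report high precision alongside a more moderate AUC.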
