Evaluation of predictive learners for cancer incidence and mortality

The ability to project cancer incidence and mortality is important for cancer research and healthcare planning, and is a vital part of cancer screening and management programs. In recent years, machine learning algorithms have been shown to achieve high forecasting accuracy and have drawn interest from healthcare professionals, the research community, planners, and policy makers. In this paper, three supervised machine learning classification techniques are compared for projecting cancer incidence and mortality rates: Bayesian Network, Naïve Bayes, and K-Nearest Neighbor. The methods were tested on datasets provided by Statistics Canada. Their performance is evaluated using the following measures: Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, Precision, ROC area, and true positive (TP) and false positive (FP) rates.
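To make the evaluation measures concrete, the following is a minimal sketch of how most of them are commonly defined. It is not the paper's code: the function names `error_measures` and `classification_rates` are hypothetical, the formulas are the standard textbook definitions rather than anything confirmed by the paper, and ROC area is omitted because it requires ranking predictions across all decision thresholds.

```python
import math


def error_measures(actual, predicted):
    """Standard error measures over numeric targets (hypothetical helper).

    `actual` holds the true values (e.g. 0/1 class indicators) and
    `predicted` holds the model's estimates (e.g. class probabilities).
    """
    n = len(actual)
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    # Relative absolute error: total error relative to a baseline that
    # always predicts the mean of the actual values.
    mean_actual = sum(actual) / n
    baseline = sum(abs(a - mean_actual) for a in actual)
    rae = sum(abs_err) / baseline
    return {"mae": mae, "rmse": rmse, "rae": rae}


def classification_rates(actual, predicted, positive=1):
    """Precision, TP rate and FP rate for one class (hypothetical helper)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return {
        "precision": tp / (tp + fp),  # share of predicted positives that are correct
        "tp_rate": tp / (tp + fn),    # share of actual positives that are found
        "fp_rate": fp / (fp + tn),    # share of actual negatives flagged as positive
    }
```

For instance, `classification_rates([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])` counts two true positives, one false positive, one false negative, and two true negatives. The sketch assumes non-empty inputs in which both classes occur, so none of the denominators is zero.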
