Applications of Machine Learning in Cancer Prediction and Prognosis

Machine learning is a branch of artificial intelligence that employs a variety of statistical, probabilistic and optimization techniques that allow computers to “learn” from past examples and to detect hard-to-discern patterns in large, noisy or complex data sets. This capability is particularly well-suited to medical applications, especially those that depend on complex proteomic and genomic measurements. As a result, machine learning is frequently used in cancer diagnosis and detection. More recently, machine learning has been applied to cancer prognosis and prediction. This latter approach is particularly interesting as it is part of a growing trend towards personalized, predictive medicine. In assembling this review we conducted a broad survey of the different types of machine learning methods being used, the types of data being integrated and the performance of these methods in cancer prediction and prognosis. A number of trends are noted, including a growing dependence on protein biomarkers and microarray data, a strong bias towards applications in prostate and breast cancer, and a heavy reliance on “older” technologies such as artificial neural networks (ANNs) instead of more recently developed or more easily interpretable machine learning methods. A number of published studies also appear to lack an appropriate level of validation or testing. Among the better designed and validated studies it is clear that machine learning methods can be used to substantially (15–25%) improve the accuracy of predicting cancer susceptibility, recurrence and mortality. At a more fundamental level, it is also evident that machine learning is helping to improve our basic understanding of cancer development and progression.
