Identification of cancer diagnosis estimation models using evolutionary algorithms: a case study for breast cancer, melanoma, and cancer in the respiratory system

In this paper we present the results of empirical research on the data-based identification of estimation models for cancer diagnoses. Based on patients' data records, which include standard blood parameters, tumor markers, and information about the diagnosis of tumors, we have trained mathematical models for estimating cancer diagnoses. Several data-based modeling approaches implemented in HeuristicLab have been applied to identify estimators for selected cancer diagnoses: linear regression, k-nearest neighbor learning, artificial neural networks, and support vector machines (all optimized using evolutionary algorithms), as well as genetic programming. The investigated diagnoses of breast cancer, melanoma, and respiratory system cancer are estimated correctly in up to 81%, 74%, and 91% of the analyzed test cases, respectively; without tumor markers, up to 75%, 74%, and 87% of the test samples are estimated correctly, respectively.
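
To illustrate what "optimized using evolutionary algorithms" can mean for one of the listed methods, the following minimal sketch evolves a feature subset and the neighborhood size k of a k-nearest neighbor classifier. It is not the HeuristicLab implementation used in the study: the synthetic data, the function names (`make_data`, `knn_predict`, `accuracy`, `evolve`), the population size, the mutation rates, and the candidate values for k are all assumptions chosen for illustration only.

```python
import random

def make_data(n, rng):
    """Synthetic records (assumption): 6 features, only features 0 and 1 carry signal."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(0, 1) for _ in range(6)]
        x[0] += 2.0 * label
        x[1] -= 2.0 * label
        rows.append((x, label))
    return rows

def knn_predict(train, x, mask, k):
    """Majority vote among the k nearest training rows, using only masked-in features."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi, m in zip(a, b, mask) if m)
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0  # k is odd, so no ties occur

def accuracy(train, samples, mask, k):
    """Fraction of correctly classified samples; an empty feature set scores 0."""
    if not any(mask):
        return 0.0
    hits = sum(knn_predict(train, x, mask, k) == y for x, y in samples)
    return hits / len(samples)

def evolve(train, valid, rng, pop_size=10, generations=8):
    """Small generational GA over (feature mask, k) with tournaments and elitism."""
    n_feat = len(train[0][0])
    ks = [1, 3, 5, 7]
    pop = [([rng.random() < 0.5 for _ in range(n_feat)], rng.choice(ks))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate each individual once per generation on the validation split.
        scored = [(accuracy(train, valid, m, k), m, k) for m, k in pop]
        best = max(scored, key=lambda s: s[0])
        children = [(best[1], best[2])]  # elitism: keep the best individual
        while len(children) < pop_size:
            p1 = max(rng.sample(scored, 3), key=lambda s: s[0])
            p2 = max(rng.sample(scored, 3), key=lambda s: s[0])
            # Uniform crossover on the mask, then bit-flip mutation.
            mask = [rng.choice(pair) for pair in zip(p1[1], p2[1])]
            mask = [(not m) if rng.random() < 0.15 else m for m in mask]
            k = rng.choice(ks) if rng.random() < 0.2 else rng.choice([p1[2], p2[2]])
            children.append((mask, k))
        pop = children
    scored = [(accuracy(train, valid, m, k), m, k) for m, k in pop]
    return max(scored, key=lambda s: s[0])

rng = random.Random(0)
data = make_data(180, rng)
train, valid, test = data[:100], data[100:140], data[140:]

val_acc, best_mask, best_k = evolve(train, valid, rng)
test_acc = accuracy(train, test, best_mask, best_k)
print("selected features:", [i for i, m in enumerate(best_mask) if m])
print("k:", best_k, "validation accuracy:", val_acc, "test accuracy:", test_acc)
```

The fitness is measured on a validation split and the final accuracy on a separate test split, so the reported test figure is not biased by the search itself. The same loop could wrap any of the other model types by swapping out `knn_predict`.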
