Analytics of Heterogeneous Breast Cancer Data Using Neuroevolution

Prognostic modeling of breast cancer is difficult because prognosis is governed by many diverse factors. Given the low median survival and the large-scale breast cancer data produced by high-throughput technology, accurate and reliable prognosis of breast cancer is becoming increasingly difficult. Accurate and timely prognosis may spare many patients painful and expensive treatments, and it may also help oncologists manage the disease more efficiently and effectively. Data analytics augmented by machine-learning algorithms have been proposed in the past for breast cancer prognosis; however, most of these approaches have not performed well, owing to the heterogeneous nature of the available data and to issues with model interpretability. We propose a robust prognostic modeling approach that yields a Pareto-optimal set of deep neural networks (DNNs) exhibiting equally good performance metrics. The set of DNNs is initialized, and their hyperparameters are optimized, using the evolutionary algorithm NSGA-III. The final DNN model is selected from the Pareto-optimal set using a fuzzy inferencing approach. Rather than treating DNNs as a black box, the proposed scheme shows how various performance metrics (such as accuracy, sensitivity, and F1) change as the hyperparameters change. This enhanced interpretability can in turn be used to improve or modify the behavior of the DNNs. The heterogeneous breast cancer database requires preprocessing so that categorical variables are better interpreted, which improves the prognosis obtained from classifiers. Furthermore, for categorical features with high cardinality, we propose a neural network-based entity-embedding method. This approach provides a vector representation of categorical features in a multidimensional space with enhanced interpretability. We present evidence that DNNs optimized using evolutionary algorithms exhibit improved performance over the other classifiers considered in this paper.
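To make the notion of a Pareto-optimal set of DNNs concrete, the following minimal sketch filters a handful of candidate models down to those that no other candidate dominates. It is not NSGA-III and not the paper's implementation: it shows only the dominance test that such algorithms build on. The model names and metric values are hypothetical, and all metrics are assumed to be maximized.

```python
from typing import Dict, List, Tuple

# One tuple of metrics per model, e.g. (accuracy, sensitivity, F1).
# Assumption: every metric is to be maximized.
Metrics = Tuple[float, ...]


def dominates(a: Metrics, b: Metrics) -> bool:
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Metrics]) -> List[str]:
    """Return the names of the candidates that no other candidate dominates."""
    return [
        name
        for name, metrics in candidates.items()
        if not any(
            dominates(other, metrics)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical candidate DNNs with made-up metric values, for illustration only.
candidates = {
    "dnn_a": (0.90, 0.80, 0.84),
    "dnn_b": (0.88, 0.86, 0.85),
    "dnn_c": (0.87, 0.79, 0.82),  # worse than dnn_a on every metric
    "dnn_d": (0.91, 0.75, 0.83),
}

front = pareto_front(candidates)
print(front)
```

In this toy example `dnn_c` is dropped because `dnn_a` is better on all three metrics, while the remaining models each trade one metric against another. Choosing a single model from such a set is the step the abstract assigns to fuzzy inferencing, which is not sketched here.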
