Dimensionality Reduction in Complex Medical Data: Improved Self-Adaptive Niche Genetic Algorithm

With the development of medical technology, more and more parameters are produced to describe the human physiological condition, forming high-dimensional clinical datasets. In clinical analysis, such data are commonly used to build mathematical models and perform classification. High dimensionality increases the complexity of the classifiers these models rely on and thus reduces their efficiency. The Niche Genetic Algorithm (NGA) is an effective algorithm for dimensionality reduction. However, in the conventional NGA the niche distance parameter is fixed in advance, so it cannot adapt as the population evolves. In this paper, an Improved Niche Genetic Algorithm (INGA) is introduced. It employs a self-adaptive niche-culling operation when constructing the niche environment, which improves population diversity and helps prevent premature convergence to local optima. The INGA was verified in a stratification model for sepsis patients. The results show that, by applying INGA, the feature dimensionality of the dataset was reduced from 77 to 10 and that the model achieved an accuracy of 92% in predicting 28-day mortality in sepsis patients, which is significantly higher than that of the other methods compared.
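The niche-culling idea can be sketched as follows. The abstract does not state how the niche distance is adapted, so this is only an illustration under stated assumptions: individuals are binary feature masks, the niche distance is recomputed each generation as a fraction of the mean pairwise Hamming distance, and the weaker of any two individuals inside that distance receives a small penalty fitness. The names `adaptive_niche_distance`, `niche_cull`, `scale`, and `penalty` are hypothetical and do not come from the paper.

```python
import numpy as np

def adaptive_niche_distance(pop, scale=0.5):
    """Assumed rule (not from the paper): niche distance is a fraction of
    the mean pairwise Hamming distance of the current population.
    Requires at least two individuals."""
    # dists[i, j] = number of feature bits in which individuals i and j differ
    dists = (pop[:, None, :] != pop[None, :, :]).sum(axis=2)
    upper = np.triu_indices(len(pop), k=1)  # each unordered pair once
    return scale * dists[upper].mean(), dists

def niche_cull(pop, fitness, scale=0.5, penalty=1e-6):
    """Penalize the weaker individual of every pair closer than the
    niche distance, so crowded regions keep only their best members."""
    radius, dists = adaptive_niche_distance(pop, scale)
    culled = np.asarray(fitness, dtype=float).copy()
    n = len(pop)
    for i in range(n):
        for j in range(i + 1, n):
            if dists[i, j] < radius:
                # Compare on the original fitness; ties penalize j.
                loser = i if fitness[i] < fitness[j] else j
                culled[loser] = penalty
    return culled, radius

# Toy population: 4 individuals, each a mask over 4 candidate features.
pop = np.array([[1, 1, 0, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 0]])
fitness = np.array([0.9, 0.8, 0.7, 0.6])
culled, radius = niche_cull(pop, fitness)
print(radius, culled)
```

In this toy run the population forms two clusters of near-identical masks, and only the fitter member of each cluster keeps its fitness. Because the distance is derived from the current population, it shrinks as the population converges, which is one plausible reading of "self-adaptive"; the paper's actual rule may differ.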
