A novel cluster detection of COVID-19 patients and medical disease conditions using improved evolutionary clustering algorithm star

As the number of samples grows, manual clustering of COVID-19 and medical disease data becomes time-consuming and requires highly skilled labour. Several algorithms have recently been used to cluster medical datasets deterministically; however, these approaches have not been effective in grouping and analysing medical diseases. Evolutionary clustering algorithms may help to cluster these diseases more effectively. On this premise, we improved the current evolutionary clustering algorithm star (ECA*), yielding iECA*, in three ways: (i) utilising the elbow method to find the correct number of clusters; (ii) cleaning and processing data as part of iECA* so that it can be applied to multivariate and domain-theory datasets; (iii) using iECA* for real-world applications in clustering COVID-19 and medical disease datasets. Experiments were conducted to examine the performance of iECA* against state-of-the-art algorithms using cluster validation measures, statistical benchmarking, and a performance ranking framework. The results demonstrate three primary findings. First, iECA* was more effective than the other algorithms in grouping the chosen medical disease datasets according to the cluster validation criteria. Second, iECA* exhibited lower execution time and memory consumption than the other clustering methods analysed on all the datasets. Third, an operational framework was proposed to rate the effectiveness of iECA* against other algorithms on the datasets analysed, and the results indicated that iECA* exhibited the best performance in clustering all medical datasets. Further research is required on real-world multi-dimensional data containing complex knowledge fields for experimental verification of iECA* compared with evolutionary algorithms.
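The elbow method named in improvement (i) can be illustrated with a minimal sketch. The abstract does not say how iECA* locates the elbow, so everything below is an assumption made for illustration: plain k-means (Lloyd's algorithm) stands in for the clustering step, the within-cluster sum of squares is the quantity plotted against the number of clusters, and the elbow is taken as the point farthest from the chord joining the first and last points of the normalised curve. The function names `kmeans_inertia` and `find_elbow` are hypothetical and do not come from the paper.

```python
import numpy as np


def kmeans_inertia(X, k, n_restarts=10, n_iter=100, seed=0):
    """Best within-cluster sum of squares over several k-means restarts.

    Assumption: plain Lloyd's k-means is a stand-in here, not iECA* itself.
    """
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(n_restarts):
        # Initialise centroids with k distinct data points.
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Squared distance from every point to every centroid: shape (n, k).
            d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Recompute centroids; keep the old one if a cluster is empty.
            new = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new, centroids):
                break
            centroids = new
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        best = min(best, float(d2.min(axis=1).sum()))
    return best


def find_elbow(ks, inertias):
    """Return the k whose point lies farthest from the first-to-last chord.

    Assumption: this chord-distance heuristic is one common way to pick
    the elbow; the paper may use a different criterion.
    """
    x = np.asarray(ks, dtype=float)
    y = np.asarray(inertias, dtype=float)
    # Normalise both axes to [0, 1] so the result does not depend on scale.
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    # Perpendicular distance of each point from the chord.
    dist = np.abs(dy * (x - x[0]) - dx * (y - y[0])) / np.hypot(dx, dy)
    return int(ks[int(dist.argmax())])


def suggest_k(X, k_max=8):
    """Scan k = 1..k_max and return the elbow of the inertia curve."""
    ks = list(range(1, k_max + 1))
    inertias = [kmeans_inertia(X, k) for k in ks]
    return find_elbow(ks, inertias)
```

Calling `suggest_k(X)` on a numeric array `X` of shape `(n_samples, n_features)` returns a suggested number of clusters. The heuristic assumes the inertia curve is not flat and that `k_max` is larger than the true number of clusters; on real medical data the elbow is often less sharp than on synthetic data, so the curve is worth inspecting alongside the automatic choice.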
