A collaborative decision support system for multi-criteria automatic clustering

Abstract Automatic clustering is a challenging problem, especially when the decision-maker (DM) has little or no information about the nature of the dataset and the criteria of interest. Current validity indexes (VIs) for automatic clustering algorithms generalize poorly, as each considers a limited number of objectives and mostly ignores other aspects of clustering validation. A mixed-integer non-linear programming model is developed, and a six-step decision support system (DSS) is proposed to solve it; the framework benefits from collaboration among selected evolutionary algorithms. The DM selects the quantitative (primary) VIs and the evolutionary algorithms. Given the DM's knowledge of the dataset and the VIs, the DM can also incorporate qualitative (secondary) VIs. The DM then determines the quality threshold for each VI and runs the evolutionary algorithms separately. The DSS saves the best value obtained for each VI as the input needed to construct the aggregated function. Based on the selected primary VIs, a new normalized aggregated function is developed and solved repeatedly using randomly selected or predefined weights of importance. Finally, the DM employs a suitable data envelopment analysis (DEA) model to identify the final clustering output among all candidate solutions. When multiple efficient solutions remain, the best-worst method and a multi-criteria decision-making approach are applied to find the final output. The applicability of the proposed approach is illustrated on one synthetic and two secondary datasets, and the result at each step is discussed in detail.
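
The abstract does not give the form of the normalized aggregated function, so the following is only a minimal sketch of one plausible reading: each primary VI is rescaled to [0, 1] using the best and worst values saved from the separate runs, and the rescaled values are combined with weights of importance. The names `normalize_vi` and `aggregate_score`, the min-max scaling, and the example numbers are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Dict


def normalize_vi(value: float, best: float, worst: float) -> float:
    """Rescale a VI value so that 1.0 is the best saved value and 0.0 the worst.

    Works for both maximized VIs (best > worst) and minimized VIs
    (best < worst), because numerator and denominator change sign together.
    Assumption: min-max scaling; the paper may normalize differently.
    """
    if best == worst:
        # Degenerate case: all runs produced the same value for this VI.
        return 1.0
    return (value - worst) / (best - worst)


def aggregate_score(
    values: Dict[str, float],
    best: Dict[str, float],
    worst: Dict[str, float],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of normalized VI values (hypothetical aggregation)."""
    total_weight = sum(weights.values())
    return sum(
        (weights[name] / total_weight)
        * normalize_vi(values[name], best[name], worst[name])
        for name in weights
    )


# Illustrative numbers only: silhouette is maximized, Davies-Bouldin is minimized.
best = {"silhouette": 0.8, "davies_bouldin": 0.5}
worst = {"silhouette": 0.2, "davies_bouldin": 2.0}
candidate = {"silhouette": 0.5, "davies_bouldin": 1.25}
weights = {"silhouette": 0.6, "davies_bouldin": 0.4}

score = aggregate_score(candidate, best, worst, weights)
print(round(score, 3))
```

Re-running `aggregate_score` with different weight dictionaries mirrors the repeated solving under randomly selected or predefined weights that the abstract describes; the resulting candidates would then be passed to the DEA step.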

[1]  Donald W. Bouldin,et al.  A Cluster Separation Measure , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Martijn R.K. Mes,et al.  Forecasting demand profiles of new products , 2020, Decis. Support Syst..

[3]  M. Tahar Kechadi,et al.  Automatic multi-objective clustering based on game theory , 2017, Expert Syst. Appl..

[4]  Chunguang Bai,et al.  Banking credit worthiness: Evaluating the complex relationships , 2018, Omega.

[5]  Wilfrido Gómez-Flores,et al.  Automatic clustering using nature-inspired metaheuristics: A survey , 2016, Appl. Soft Comput..

[6]  Kate Smith-Miles,et al.  A mathematical programming approach to optimise insurance premium pricing within a data mining framework , 2002, J. Oper. Res. Soc..

[7]  Nargis Pervin,et al.  Towards generating scalable personalized recommendations: Integrating social trust, social bias, and geo-spatial clustering , 2019, Decis. Support Syst..

[8]  Erik D. Goodman,et al.  Evolutionary multi-objective automatic clustering enhanced with quality metrics and ensemble strategy , 2020, Knowl. Based Syst..

[9]  He Li,et al.  Multi-objective evolutionary clustering for large-scale dynamic community detection , 2021, Inf. Sci..

[10]  Daming Shi,et al.  Multi-objective evolutionary clustering with complex networks , 2021, Expert Syst. Appl..

[11]  Jaya Sil,et al.  Automatic clustering by multi-objective genetic algorithm with numeric and categorical features , 2019, Expert Syst. Appl..

[12]  Taha Mokfi,et al.  Evaluation and selection of clustering methods using a hybrid group MCDM , 2019, Expert Syst. Appl..

[13]  P. Rousseeuw Silhouettes: a graphical aid to the interpretation and validation of cluster analysis , 1987 .

[14]  J. Gower A General Coefficient of Similarity and Some of Its Properties , 1971 .

[15]  Hong He,et al.  A two-stage genetic algorithm for automatic clustering , 2012, Neurocomputing.

[16]  Matthew D. Bailey,et al.  An optimization-based DSS for student-to-teacher assignment: Classroom heterogeneity and teacher performance measures , 2019, Decis. Support Syst..

[17]  Sebastián Maldonado,et al.  Telecom traffic pumping analytics via explainable data science , 2021, Decis. Support Syst..

[18]  Feng Li,et al.  Multi-objective artificial immune algorithm for fuzzy clustering based on multiple kernels , 2019, Swarm Evol. Comput..

[19]  Baidyanath Biswas,et al.  Examining the determinants of the count of customer reviews in peer-to-peer home-sharing platforms using clustering and count regression techniques , 2020, Decis. Support Syst..

[20]  Sanghamitra Bandyopadhyay,et al.  Some connectivity based cluster validity indices , 2012, Appl. Soft Comput..

[21]  Joshua D. Knowles,et al.  An Evolutionary Approach to Multiobjective Clustering , 2007, IEEE Transactions on Evolutionary Computation.

[22]  J. Rezaei Best-worst multi-criteria decision-making method , 2015 .

[23]  Jonathan F. Bard,et al.  Pickup and delivery network segmentation using contiguous geographic clustering , 2011, J. Oper. Res. Soc..

[24]  Dursun Delen,et al.  Clustering temporal disease networks to assist clinical decision support systems in visual analytics of comorbidity progression , 2021, Decis. Support Syst..

[25]  Adriano Lorena Inácio de Oliveira,et al.  Automatic trading method based on piecewise aggregate approximation and multi-swarm of improved self-adaptive particle swarm optimization with validation , 2017, Decis. Support Syst..

[26]  Shen-Tsu Wang,et al.  Integrating KPSO and C5.0 to analyze the omnichannel solutions for optimizing telecommunication retail , 2017, Decis. Support Syst..

[27]  Ka Yee Yeung,et al.  Validating clustering for gene expression data , 2001, Bioinform..

[28]  Ganapati Panda,et al.  Automatic clustering algorithm based on multi-objective Immunized PSO to classify actions of 3D human models , 2013, Eng. Appl. Artif. Intell..

[29]  Tomoharu Nagao,et al.  Evolutionary image segmentation based on multiobjective clustering , 2009, 2009 IEEE Congress on Evolutionary Computation.

[30]  Jayant Rajgopal,et al.  Optimizing vaccine distribution networks in low and middle-income countries , 2019, Omega.

[31]  Abraham Charnes,et al.  Measuring the efficiency of decision making units , 1978 .

[32]  O. A. Olanrewaju,et al.  Integrated IDA–ANN–DEA for assessment and optimization of energy consumption in industrial sectors , 2012 .

[33]  Hisao Ishibuchi,et al.  Multi-clustering via evolutionary multi-objective optimization , 2018, Inf. Sci..

[34]  M.-C. Su,et al.  A new cluster validity measure and its application to image compression , 2004, Pattern Analysis and Applications.

[35]  Abdelaziz I. Hammouri,et al.  Comparison between compactness and connectedness criteria in data clustering , 2017 .

[36]  Qi Zhao,et al.  Reference vector-based multi-objective clustering for high-dimensional data , 2019, Appl. Soft Comput..

[37]  R. J. Kuo,et al.  Multi-objective cluster analysis using a gradient evolution algorithm , 2020, Soft Comput..

[38]  Rahul C. Basole,et al.  Visual analytics for supply network management: System design and evaluation , 2016, Decis. Support Syst..

[39]  Nicandro Cruz-Ramírez,et al.  Improved multi-objective clustering with automatic determination of the number of clusters , 2016, Neural Computing and Applications.

[40]  Holmes Finch,et al.  Comparison of Distance Measures in Cluster Analysis with Dichotomous Data , 2021, Journal of Data Science.

[41]  J. Dunn Well-Separated Clusters and Optimal Fuzzy Partitions , 1974 .

[42]  Ali Dag,et al.  A Bayesian Belief Network-based probabilistic mechanism to determine patient no-show risk categories , 2020 .

[43]  Sriparna Saha,et al.  A generalized automatic clustering algorithm in a multiobjective framework , 2013, Appl. Soft Comput..

[44]  Joachim M. Buhmann,et al.  Stability-Based Model Order Selection in Clustering with Applications to Gene Expression Data , 2002, ICANN.