Multiobjective feature selection for key quality characteristic identification in production processes using a nondominated-sorting-based whale optimization algorithm

Abstract: Identifying key quality characteristics (QCs) in production processes is essential for product quality control and improvement. This paper proposes a multiobjective wrapper-based feature selection (FS) method for key QC (KQC) identification on unbalanced production data, using a novel modified nondominated-sorting-based whale optimization algorithm (MNSWOA) and the ideal point method (IPM). The FS problem is defined as maximizing the geometric mean (GM) measure while minimizing the size of the feature (QC) subset. To solve it, MNSWOA first finds a set of candidate solutions (feature subsets), and IPM then selects the final solution from that set. Within MNSWOA, a modified fast nondominated sorting approach adapts the single-objective whale optimization algorithm to the multiobjective setting, and a uniform reference solution selection strategy and mutation operations are embedded to improve its search performance. Experimental results on four unbalanced production datasets show that the proposed FS method performs effectively and efficiently for KQC identification. Further comparisons show that MNSWOA obtains better search performance than benchmark multiobjective optimization methods, including a modified NSGA-II, SPEA2, MOEA/D, NSPSO and CMDPSO.
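
The two objectives and the final selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give formulas, so the sketch assumes the commonly used definition of GM as the square root of the true positive rate times the true negative rate, a plain Pareto-dominance filter in place of the modified fast nondominated sorting, and an ideal point of (GM = 1, subset size ratio = 0) with Euclidean distance for IPM. All function names are hypothetical.

```python
import math


def geometric_mean(y_true, y_pred, positive=1):
    """GM measure, assumed here to be sqrt(TPR * TNR) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)


def dominates(a, b):
    """a and b are (gm, subset_size) pairs; GM is maximized, size is minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def nondominated(points):
    """Keep the points that no other point dominates (a plain Pareto filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def ideal_point_choice(front, n_features):
    """Pick the front member closest to the assumed ideal point (GM=1, size ratio=0)."""
    return min(front, key=lambda p: math.hypot(1.0 - p[0], p[1] / n_features))


# Hypothetical candidate feature subsets, summarized as (GM, subset size),
# for a dataset that is assumed to have 20 features.
candidates = [(0.90, 8), (0.85, 3), (0.80, 5), (0.60, 1), (0.90, 10)]
front = nondominated(candidates)
final = ideal_point_choice(front, n_features=20)
gm_example = geometric_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

In this sketch the subset size is divided by the total number of features so that both objectives lie in [0, 1] before the distance is computed; other normalizations would be equally plausible and could change which solution is chosen.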
