On the influence of feature selection in fuzzy rule-based regression model generation

Fuzzy rule-based models have been extensively used in regression problems. Besides high accuracy, one of the most appreciated characteristics of these models is their interpretability, which is generally measured in terms of complexity. Complexity is affected by the number of features used for generating the model: the lower the number of features, the lower the complexity. Feature selection can therefore contribute considerably not only to speeding up the learning process, but also to improving the interpretability of the final model. Nevertheless, very few methods for selecting features before learning regression models have been proposed in the literature. In this paper, we focus on these methods, which perform feature selection as a pre-processing step. In particular, we have adapted two state-of-the-art feature selection algorithms, namely NMIFS and CFS, originally proposed for classification, to deal with regression. Further, we have proposed FMIFS, a novel forward sequential feature selection approach, based on the minimal-redundancy-maximal-relevance criterion, which can directly manage fuzzy partitions. The relevance of a feature is measured as the fuzzy mutual information between the feature and the output variable, and its redundancy as the average fuzzy mutual information between the feature and the already selected features. The stopping criterion for the sequential selection is based on the average values of relevance and redundancy of the already selected features. We have performed two experiments on twenty regression datasets. In the first experiment, we aimed to show the effectiveness of feature selection in fuzzy rule-based regression model generation by comparing the mean square errors achieved by the fuzzy rule-based models generated using all the features with those achieved using the features selected by FMIFS, NMIFS and CFS. In order to avoid possible biases related to the specific algorithm, we adopted the well-known Wang and Mendel algorithm for generating the fuzzy rule-based models. We show that the mean square errors obtained by models generated using the features selected by FMIFS are on average similar to the values achieved by using all the features, and lower than the ones obtained by employing the subsets of features selected by NMIFS and CFS. In the second experiment, we intended to evaluate how feature selection can reduce the convergence time of evolutionary fuzzy systems, which are probably the most effective fuzzy techniques for tackling regression problems. By using a state-of-the-art multi-objective evolutionary fuzzy system based on rule learning and membership function tuning, we show that the number of evaluations can be considerably reduced when the dataset is pre-processed by feature selection.
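The forward sequential selection described above can be illustrated with a minimal Python sketch. It is not the authors' FMIFS implementation: the abstract does not give the formulas, so several choices here are assumptions. Specifically, the sketch assumes uniform strong triangular partitions, a joint probability of fuzzy events computed with the product of membership degrees, a score equal to relevance minus average redundancy, and a simple stopping rule (stop when the best remaining score is not positive) that stands in for the paper's criterion based on average relevance and redundancy. All function names are hypothetical.

```python
import math


def triangular_memberships(values, n_sets=5):
    """Membership degrees in a uniform strong triangular partition of [min, max].

    Returns one row of n_sets degrees per value; each row sums to 1.
    (Assumed partition shape; the source does not specify one.)
    """
    lo, hi = min(values), max(values)
    if hi == lo:  # constant variable: put all the mass on the first fuzzy set
        return [[1.0] + [0.0] * (n_sets - 1) for _ in values]
    width = (hi - lo) / (n_sets - 1)
    centers = [lo + i * width for i in range(n_sets)]
    return [[max(0.0, 1.0 - abs(v - c) / width) for c in centers] for v in values]


def fuzzy_mutual_information(mu_a, mu_b):
    """Mutual information between two fuzzy partitions evaluated on the same samples.

    The joint probability of fuzzy sets A_i and B_j is estimated as the mean of
    the product of their membership degrees (an assumption of this sketch).
    """
    n = len(mu_a)
    ka, kb = len(mu_a[0]), len(mu_b[0])
    joint = [[sum(ra[i] * rb[j] for ra, rb in zip(mu_a, mu_b)) / n
              for j in range(kb)] for i in range(ka)]
    pa = [sum(row) for row in joint]                               # marginal of A
    pb = [sum(joint[i][j] for i in range(ka)) for j in range(kb)]  # marginal of B
    return sum(p * math.log(p / (pa[i] * pb[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)


def forward_select(columns, target, n_sets=5):
    """Greedy forward selection with a relevance-minus-redundancy score.

    columns: list of feature columns (each a list of floats); target: list of floats.
    Returns the indices of the selected features in selection order.
    """
    mu_y = triangular_memberships(target, n_sets)
    mus = [triangular_memberships(col, n_sets) for col in columns]
    relevance = [fuzzy_mutual_information(mu, mu_y) for mu in mus]
    selected, remaining = [], list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(fuzzy_mutual_information(mus[j], mus[s])
                         for s in selected) / len(selected)
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        # Stand-in stopping rule (assumption): stop once no candidate adds
        # more relevance than redundancy.
        if selected and score(best) <= 0:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `forward_select(columns, target)` with one list per feature returns feature indices in the order they were picked. Because relevance and redundancy are not normalized in this sketch, the stand-in stopping rule can be stricter or looser than the criterion used in the paper.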
