Distributed Fuzzy Cognitive Maps for Feature Selection in Big Data Classification

The features of a dataset play an important role in the construction of a machine learning model. Big datasets often have a large number of features, some of which are only weakly relevant to the learning task and make training more time-consuming and complex. To facilitate learning, it is generally recommended to remove these less significant features. Eliminating the irrelevant features and finding an optimal feature set involves an exhaustive search that considers every subset of the features. In this research, we present a distributed, learning-based wrapper method for feature selection, built on fuzzy cognitive maps, that extracts the features that play the most significant role in decision making. Fuzzy cognitive maps (FCMs) are a hybrid computing technique that combines elements of fuzzy logic and cognitive maps. By using Spark’s resilient distributed datasets (RDDs), the proposed model works in a distributed manner, with fast in-memory processing and efficient iterative computation. According to the experimental results, when the proposed model is applied to a classification task, the features it selects help to expedite the classification process. The proposed algorithm selects relevant features on par with existing feature selection algorithms. In conjunction with a random forest classifier, the proposed model produced an average accuracy above 90%, as opposed to 85.6% accuracy when no feature selection strategy was adopted.
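
To make the notion of an FCM concrete, the following is a minimal sketch of FCM inference, not the method proposed here. It assumes the commonly used update rule in which each concept's activation is its previous value plus the weighted sum of the other concepts' activations, squashed by a sigmoid. The function names (`fcm_step`, `fcm_run`), the steepness parameter `lam`, and the convergence tolerance are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def sigmoid(x, lam=1.0):
    """Logistic squashing function; lam is an assumed steepness parameter."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_step(state, weights, lam=1.0):
    """One synchronous FCM update (assumed rule, not taken from the paper):
    A_i <- f(A_i + sum over j != i of weights[j][i] * A_j),
    where weights[j][i] is the causal influence of concept j on concept i.
    """
    n = len(state)
    return [
        sigmoid(
            state[i]
            + sum(weights[j][i] * state[j] for j in range(n) if j != i),
            lam,
        )
        for i in range(n)
    ]


def fcm_run(state, weights, max_iter=100, tol=1e-6):
    """Iterate fcm_step until the largest activation change falls below tol,
    or until max_iter updates have been applied."""
    for _ in range(max_iter):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state
```

In a feature selection setting, concepts could stand for features and a class node, with the learned weights into the class node read as relevance scores; that mapping is an assumption here and may differ from the one used by the proposed model.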
