MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix

Fuzzy-rough set theory is an effective method for attribute reduction because it handles imprecision and uncertainty in the data. Despite this efficacy, current approaches to fuzzy-rough attribute reduction do not scale to large data sets because of their high space complexity. A limited number of accelerators and parallel/distributed approaches have been proposed for fuzzy-rough attribute reduction in large data sets. However, all of these are dependency-measure-based methods, in which fuzzy similarity matrices are used to perform attribute reduction. Alternative discernibility-matrix-based attribute reduction methods are found to have lower space requirements and to be more amenable to parallelization when building parallel/distributed algorithms. This paper therefore introduces a fuzzy discernibility matrix-based attribute reduction accelerator (DARA) to accelerate attribute reduction. DARA is used to build a sequential approach and a corresponding parallel/distributed approach for attribute reduction in large data sets. The proposed approaches are compared with existing state-of-the-art approaches in a systematic experimental analysis of computational efficiency. The experimental study, along with theoretical validation, shows that the proposed approaches are effective and perform better than the current approaches.
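
To make the idea of a fuzzy discernibility matrix concrete, the following is a minimal illustrative sketch and not the DARA algorithm itself. It assumes attribute values scaled to [0, 1], takes the per-attribute discernibility of two objects to be the absolute difference of their values, stores entries only for object pairs with different decisions, and combines attributes with max. The function names `fuzzy_discernibility` and `greedy_reduct` and the toy data are hypothetical.

```python
from itertools import combinations


def fuzzy_discernibility(data, decisions):
    """Map each object pair with different decisions to its per-attribute
    discernibility degrees (assumed here to be |x_a - y_a| on [0, 1] data)."""
    entries = {}
    for i, j in combinations(range(len(data)), 2):
        if decisions[i] != decisions[j]:
            entries[(i, j)] = [abs(a - b) for a, b in zip(data[i], data[j])]
    return entries


def greedy_reduct(entries, n_attrs):
    """Greedily add the attribute that most increases the total degree to
    which the stored pairs are discerned, until the selected subset matches
    the full attribute set (assumption: max combines attributes)."""
    def score(attrs):
        if not attrs:
            return 0.0
        return sum(max(e[a] for a in attrs) for e in entries.values())

    target = score(list(range(n_attrs)))
    selected = []
    while score(selected) < target - 1e-12:
        remaining = [a for a in range(n_attrs) if a not in selected]
        best = max(remaining, key=lambda a: score(selected + [a]))
        selected.append(best)
    return selected


# Toy decision table: 4 objects, 3 conditional attributes, 2 decision classes.
data = [
    [0.00, 1.00, 0.5],
    [0.25, 0.00, 0.5],
    [1.00, 0.75, 0.5],
    [0.75, 0.25, 0.5],
]
decisions = [0, 0, 1, 1]

entries = fuzzy_discernibility(data, decisions)
reduct = greedy_reduct(entries, n_attrs=3)
print(reduct)
```

In this toy table the first attribute alone discerns every cross-class pair as well as all three attributes together, so the greedy search stops after selecting it. Only the four cross-class pairs are stored, rather than all six object pairs.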
