Improving Fuzzy Rule Based Classification Systems in Big Data via Support-based Filtering

Fuzzy Rule Based Classification Systems have the benefit of making it possible to understand the decisions of the classifier. Additionally, they have been shown to be robust when solving complex problems. When these capabilities are applied in the context of Big Data, the benefits are multiplied. Therefore, to obtain the full advantage of Fuzzy Rule Based Classification Systems, the output model must be both interpretable and accurate. The former is achieved by using fuzzy linguistic labels, which are related to human understanding. The latter is achieved by means of robust fuzzy rules, which are identified through a component known as the "fuzzy rule weight". However, obtaining these rule weights is computationally expensive, resulting in a bottleneck when applied to Big Data problems. In this work, we propose Chi-BD-SF, which stands for Chi Big Data Support Filtering. It comprises a scalable yet accurate fuzzy rule learning algorithm. It is based on the well-known algorithm of Chi et al., replacing the rule weight computation with a support metric in order to resolve conflicts between rules that share an antecedent but have different consequents. To assess the quality of this proposal, we analyze several performance metrics: the quality of classification, the robustness of the generated rule base, and the runtimes obtained when using traditional rule weights versus the support of the rule. The results of our novel Chi-BD-SF approach, in contrast to related Big Data fuzzy classifiers, show that the proposal is faster than computing rule weights while also obtaining more accurate results.
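The idea of resolving rule conflicts by support rather than by rule weight can be illustrated with a small, single-machine sketch. It is not the authors' implementation: the abstract does not define the support metric, so the code assumes that support is simply the number of training examples that generate a given antecedent with a given class, that features are normalised to [0, 1], and that each variable uses evenly spaced triangular labels. All function names (`triangular_labels`, `membership`, `best_label`, `learn_rules`) are hypothetical.

```python
from collections import defaultdict


def triangular_labels(n_labels):
    """Peaks of n_labels evenly spaced triangular fuzzy sets on [0, 1]."""
    return [i / (n_labels - 1) for i in range(n_labels)]


def membership(x, peak, width):
    """Triangular membership degree of x for a label centred at `peak`."""
    return max(0.0, 1.0 - abs(x - peak) / width)


def best_label(x, peaks):
    """Index of the linguistic label with the highest membership for x."""
    width = peaks[1] - peaks[0]
    return max(range(len(peaks)), key=lambda i: membership(x, peaks[i], width))


def learn_rules(X, y, n_labels=3):
    """Chi-style rule generation with support-based conflict resolution.

    Assumption: support = number of examples producing the same antecedent
    and class. The paper's exact support definition may differ.
    """
    peaks = triangular_labels(n_labels)
    # support[antecedent][class] -> count of examples backing that rule
    support = defaultdict(lambda: defaultdict(int))
    for features, label in zip(X, y):
        antecedent = tuple(best_label(v, peaks) for v in features)
        support[antecedent][label] += 1
    # For each antecedent, keep only the consequent with the highest support.
    return {
        antecedent: max(counts, key=counts.get)
        for antecedent, counts in support.items()
    }


if __name__ == "__main__":
    X = [[0.1, 0.9], [0.05, 0.95], [0.0, 1.0], [0.9, 0.1]]
    y = ["a", "a", "b", "b"]
    print(learn_rules(X, y))
```

Because the counts are plain sums keyed by antecedent and class, this step would map naturally onto a distributed count-by-key aggregation, which is presumably where the scalability comes from; the sketch leaves that part out.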
