Towards MapReduce approach with dynamic fuzzy inference/interpolation for big data classification problems

Big data and its applications have become a prominent research topic, since big data technologies make it possible to extract knowledge from high volumes of information. In practice, the MapReduce framework and its extensions are the most popular approaches to big data processing. Among these approaches, models based on fuzzy systems stand out in many applications, because fuzzy set theory allows the variety and veracity of big data to be accommodated. However, a limitation of classical fuzzy inference also appears in big data environments: when a given observation has no overlap with the antecedent values of any rule, no rule can be invoked and therefore no consequence can be derived. Fortunately, fuzzy rule interpolation techniques can support inference in such cases, and combining traditional fuzzy reasoning with fuzzy interpolation may improve the accuracy of the inferred conclusions. This paper therefore reports an initial investigation into a MapReduce framework with dynamic fuzzy inference/interpolation for big data applications (BigData-DFRI). The results of an experimental investigation of this method are presented, demonstrating the potential and efficacy of the proposed approach.
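
To make the idea of switching between inference and interpolation concrete, the following is a minimal sketch and not the BigData-DFRI implementation itself. It assumes single-antecedent rules with triangular antecedents and crisp consequents, crisp observations, and a plain linear interpolation between the two nearest rules in place of any transformation-based interpolation method. The names `Rule`, `membership`, `infer`, `map_chunk`, and `run` are hypothetical, and the map and reduce steps are simulated in a single process.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Rule:
    a: float  # left foot of the triangular antecedent
    b: float  # peak of the triangular antecedent
    c: float  # right foot of the triangular antecedent
    y: float  # crisp consequent (a simplifying assumption)


def membership(rule, x):
    """Triangular membership of a crisp observation x in the rule antecedent."""
    if x <= rule.a or x >= rule.c:
        return 0.0
    if x <= rule.b:
        return (x - rule.a) / (rule.b - rule.a)
    return (rule.c - x) / (rule.c - rule.b)


def infer(rules, x):
    """Fire overlapping rules if any exist, otherwise interpolate."""
    fired = [(membership(r, x), r) for r in rules]
    fired = [(m, r) for m, r in fired if m > 0.0]
    if fired:
        # Classical inference: membership-weighted average of consequents.
        total = sum(m for m, _ in fired)
        return sum(m * r.y for m, r in fired) / total, "inference"
    # No rule overlaps x: interpolate linearly between the two rules whose
    # peaks are closest to x (assumes at least two rules with distinct peaks).
    r1, r2 = sorted(rules, key=lambda r: abs(r.b - x))[:2]
    lam = (x - r1.b) / (r2.b - r1.b)
    return (1 - lam) * r1.y + lam * r2.y, "interpolation"


def map_chunk(rules, chunk):
    """Map step: process one data partition independently of the others."""
    return [(x, *infer(rules, x)) for x in chunk]


def run(rules, chunks):
    """Simulated MapReduce: map over partitions, then concatenate the results."""
    mapped = [map_chunk(rules, chunk) for chunk in chunks]
    return reduce(lambda left, right: left + right, mapped, [])


if __name__ == "__main__":
    rules = [Rule(0.0, 1.0, 2.0, 10.0), Rule(6.0, 7.0, 8.0, 40.0)]
    for x, output, mode in run(rules, [[1.0, 4.0], [5.0]]):
        print(x, output, mode)
```

In this sketch the observation 1.0 overlaps the first rule and is handled by ordinary inference, whereas 4.0 and 5.0 fall in the gap between the two antecedents and are handled by interpolation. In a real deployment the map step would run on separate nodes of a Hadoop-style cluster rather than in a list comprehension.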
