Efficient data preprocessing for genetic-fuzzy mining with MapReduce

Genetic-fuzzy data mining can discover linguistic association rules, together with appropriate membership functions that are close to human concepts, from quantitative transactions, and it has therefore become a promising research field in recent years. It repeatedly uses fuzzy frequent 1-itemsets to evaluate the fitness values of chromosomes, which is very time-consuming. In this paper, we propose a MapReduce preprocessing approach that efficiently transforms the given quantitative transaction data into pairs of items and quantity lists in order to improve the performance of genetic-fuzzy mining. The MapReduce architecture is well suited to this conversion because of its key-value format. Experimental results also show the effectiveness of the proposed approach.
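The conversion described above can be illustrated with a minimal single-process sketch. It is not the paper's implementation: the function names (`map_transaction`, `reduce_item`, `preprocess`), the dictionary representation of a transaction, and the sample data are all assumptions made for illustration, and the shuffle step that a real MapReduce framework performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumed representation: one quantitative transaction maps item -> purchased quantity.
Transaction = Dict[str, int]


def map_transaction(transaction: Transaction) -> Iterator[Tuple[str, int]]:
    """Map phase: emit one (item, quantity) key-value pair per purchased item."""
    for item, quantity in transaction.items():
        yield item, quantity


def reduce_item(item: str, quantities: Iterable[int]) -> Tuple[str, List[int]]:
    """Reduce phase: collect all quantities seen for one item into a list."""
    return item, list(quantities)


def preprocess(transactions: Iterable[Transaction]) -> Dict[str, List[int]]:
    """Simulate map -> shuffle -> reduce in a single process."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for transaction in transactions:
        for item, quantity in map_transaction(transaction):
            grouped[item].append(quantity)  # shuffle: group values by key
    return dict(reduce_item(item, qs) for item, qs in grouped.items())


# Hypothetical sample data, not taken from the paper.
transactions = [
    {"milk": 3, "bread": 1},
    {"milk": 5, "juice": 2},
    {"bread": 4, "juice": 1, "milk": 2},
]

item_quantities = preprocess(transactions)
print(item_quantities)
```

With the data in this per-item form, each fitness evaluation can read an item's quantity list directly instead of rescanning every transaction, which is the kind of saving the preprocessing step is aiming for.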
