Paper information - Efficient data preprocessing for genetic-fuzzy mining with MapReduce


Genetic-fuzzy data mining can discover linguistic association rules, together with appropriate membership functions that are close to human concepts, from quantitative transactions, and it has therefore become a promising research field in recent years. It repeatedly uses fuzzy frequent 1-itemsets to evaluate the fitness values of chromosomes, which is very time-consuming. In this paper, we propose a MapReduce preprocessing approach that efficiently transforms the given quantitative transaction data into pairs of items and quantity lists, in order to improve the performance of genetic-fuzzy mining. The MapReduce architecture is well suited to this conversion because of its key-value format. Experimental results also demonstrate the effectiveness of the proposed approach.
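The transformation described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: it only imitates the map and reduce steps in memory, and the transaction format (a dictionary from item name to quantity), the function names `map_phase`, `reduce_phase`, and `preprocess`, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical input format: one dict per transaction, item name -> quantity.
Transaction = Dict[str, int]


def map_phase(transaction: Transaction) -> Iterator[Tuple[str, int]]:
    """Emit one (item, quantity) key-value pair per item in a transaction."""
    for item, quantity in transaction.items():
        yield item, quantity


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted pairs by key, so each item maps to its quantity list."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for item, quantity in pairs:
        grouped[item].append(quantity)
    return dict(grouped)


def preprocess(transactions: Iterable[Transaction]) -> Dict[str, List[int]]:
    """Run the map step over every transaction, then reduce the pairs."""
    pairs = (pair for t in transactions for pair in map_phase(t))
    return reduce_phase(pairs)


# Made-up sample data, for illustration only.
transactions = [
    {"milk": 3, "bread": 1},
    {"milk": 5},
    {"bread": 2, "beer": 6},
]
item_quantities = preprocess(transactions)
print(item_quantities)
```

With the data in this form, a per-item quantity list can be read directly each time a chromosome is evaluated, rather than rescanning every transaction; how the paper uses the lists in the fitness evaluation is not detailed in this abstract.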

Tzung-Pei Hong | Min-Thai Wu | Chun-Wei Tsai | Yu-Yang Liu
