FAR-miner: a fast and efficient algorithm for fuzzy association rule mining

Association rule mining (ARM) algorithms work only with binary attributes, and expect quantitative attributes to be converted to binary ones using sharp partitions, like 'age = [25, 60]'. A better alternative is to convert quantitative attributes to fuzzy attributes, like 'age = middle-aged', which avoids the information loss caused by sharp partitioning, and then run a fuzzy ARM algorithm. The most popular fuzzy ARM algorithms are fuzzy adaptations of apriori. Like apriori, fuzzy apriori is slow, especially on medium-sized (500 K to 1 M) and large (> 1 M) datasets. We propose a new fuzzy ARM algorithm, FAR-miner, designed for fast and efficient performance. Through experiments we show that FAR-miner is 8-19 times faster than fuzzy apriori on large datasets and 6-10 times faster on medium-sized datasets. This efficiency comes from two-phased, multiple-partition, tidlist-style processing, a byte-vector representation of tidlists, and effective compression of those tidlists.
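To make the ideas of fuzzy attributes and byte-vector tidlists concrete, the following is a minimal Python sketch, not the authors' implementation. Everything specific in it is an assumption made for illustration: the triangular membership function and its breakpoints, the quantization of membership degrees to one byte in the range 0 to 100, the use of the minimum as the combination operator for itemsets, and zlib as a stand-in compressor. The abstract does not specify any of these details.

```python
import zlib


def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b.
    (Assumed shape; the abstract does not say how fuzzy sets are defined.)"""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def to_byte_vector(memberships):
    """Quantize memberships in [0, 1] to one byte each (0..100).
    The 0..100 scale is a hypothetical choice for this sketch."""
    return bytearray(round(m * 100) for m in memberships)


def intersect(t1, t2):
    """Combine two fuzzy tidlists position by position using the minimum
    (an assumed t-norm), giving the tidlist of the joint itemset."""
    return bytearray(min(a, b) for a, b in zip(t1, t2))


def support(tidlist):
    """Fuzzy support: mean membership over all transactions."""
    return sum(tidlist) / (100 * len(tidlist))


# Five toy transactions: an age per record, fuzzified as 'age = middle-aged'.
ages = [20, 30, 40, 50, 70]
middle_aged = to_byte_vector(triangular(x, 20, 40, 60) for x in ages)

# Memberships for a second, made-up fuzzy item 'income = high'.
high_income = to_byte_vector([0.0, 0.2, 0.8, 1.0, 1.0])

# Tidlist and support of the 2-itemset {age = middle-aged, income = high}.
both = intersect(middle_aged, high_income)
both_support = support(both)

# Byte vectors compress with any general-purpose byte compressor.
packed = zlib.compress(bytes(both))
restored = bytearray(zlib.decompress(packed))
```

Because each tidlist is a flat vector with one byte per transaction, the support of a larger itemset follows from one pass over two vectors, with no rescan of the original records. That is the general appeal of tidlist-style processing over repeated apriori-style database scans.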
