Mining Frequent Itemsets over Vertical Probabilistic Dataset Using FuzzyUeclat Algorithm

Data mining is the technique of extracting relevant information from large heterogeneous data sources and presenting it to users in a meaningful and systematic way. Data mining is now widely adopted across computer-related industries, and many current algorithms focus on computing frequent patterns from certain and uncertain databases. In this paper, we propose a new algorithm, FuzzyUeclat, which uses fuzzy operators to extract frequent itemsets from a vertical probabilistic dataset. Previously, frequent itemsets were computed on the basis of an expected minimum support derived from the existential probabilities associated with items. With that approach, as the size of the database increases, the expected support decreases at each level of iteration and eventually approaches zero. To avoid this loss of frequent itemsets, the expected minimum support is replaced with the fuzzy min and fuzzy max operators when computing the support of an itemset. The experimental results show that FuzzyUeclat finds more frequent patterns and improves performance on large uncertain databases.
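
The contrast between the two support measures can be illustrated with a minimal sketch. It is not the FuzzyUeclat implementation: the abstract does not say how the fuzzy min and fuzzy max operators are combined, so the sketch only compares a product-based expected support (assuming item independence) with a support based on the fuzzy min operator. The toy dataset `VERTICAL_DB` and the function names `common_tids`, `expected_support`, and `fuzzy_min_support` are hypothetical.

```python
# Hypothetical vertical probabilistic dataset (an assumption for illustration):
# item -> {transaction id: existential probability of the item in that transaction}
VERTICAL_DB = {
    "a": {1: 0.9, 2: 0.8, 3: 0.5},
    "b": {1: 0.5, 2: 0.5, 4: 0.9},
    "c": {1: 0.5, 2: 0.25, 3: 0.7},
}


def common_tids(db, itemset):
    """Transaction ids that contain every item of the itemset (tid-list intersection)."""
    tid_sets = [set(db[item]) for item in itemset]
    return set.intersection(*tid_sets)


def expected_support(db, itemset):
    """Expected support: sum over shared transactions of the product of the
    item probabilities (assumes the items are independent)."""
    total = 0.0
    for tid in common_tids(db, itemset):
        prob = 1.0
        for item in itemset:
            prob *= db[item][tid]
        total += prob
    return total


def fuzzy_min_support(db, itemset):
    """Assumed fuzzy variant: sum over shared transactions of the minimum
    item probability (fuzzy intersection) instead of the product."""
    return sum(
        min(db[item][tid] for item in itemset)
        for tid in common_tids(db, itemset)
    )
```

On this toy data, the itemset ("a", "b", "c") has an expected support of 0.325 under the product but 0.75 under the fuzzy min. Because the minimum of several probabilities is never smaller than their product, the min-based support shrinks more slowly as itemsets grow, which is the effect the abstract describes.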
