Parallel Implantation of Frequent Itemset Mining Using Inverted Matrix Based on OpenCL

Extracting knowledge in the form of frequent itemsets and association rules is of great importance in the field of data mining. The Apriori algorithm requires multiple scans of the database and therefore has high memory dependency. The frequent pattern tree (FP-tree) growth algorithm, on the other hand, becomes impractical for large databases because its data structure is memory-based. An efficient approach that combines an inverted matrix with a COFI (co-occurrence frequent item) tree alleviates the disadvantages of both algorithms. For massively large computations, modern GPUs provide a large set of parallel processors that can be used for general-purpose computing. General-purpose computing on graphics processing units (GPGPU) is a way of using an existing GPU for general-purpose work. Open Computing Language (OpenCL) provides a standard for cross-platform programming on modern processors such as many-core CPUs and GPUs. Because the inverted matrix approach has advantages over the other algorithms, it is desirable to parallelize it with OpenCL. We propose a new technique called CLInverted matrix itemset mining, which is an advancement over existing techniques and contributes to load sharing. The architecture proposed in this paper highlights the implementation of the inverted matrix approach on the OpenCL framework. In experiments, we compare the results of the serial and parallel versions of the proposed approach on various OpenCL devices.
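As a rough illustration of the idea of mining frequent itemsets from an inverted layout, the sketch below uses a plain inverted index (item to transaction-ID set) in Python. This is a deliberate simplification and an assumption on our part: it is not the inverted matrix with COFI-trees described in the cited work, nor the CLInverted technique itself, and the names `build_inverted_index`, `frequent_itemsets`, `min_support`, and `max_size` are hypothetical. It only shows that, once the data is inverted, the support of each candidate itemset can be computed independently of the others, which is the kind of work that lends itself to parallel execution.

```python
from itertools import combinations


def build_inverted_index(transactions):
    """Map each item to the set of transaction IDs that contain it."""
    index = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets of up to max_size items.

    Simplified illustration only: brute-force enumeration over frequent
    items, not the inverted matrix / COFI-tree algorithm.
    """
    index = build_inverted_index(transactions)
    # Keep only the items that are frequent on their own.
    frequent_items = sorted(
        item for item, tids in index.items() if len(tids) >= min_support
    )
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(frequent_items, size):
            # Support is the size of the intersection of the items'
            # transaction-ID sets; each combo is independent of the others.
            tids = set.intersection(*(index[item] for item in combo))
            if len(tids) >= min_support:
                result[combo] = len(tids)
    return result


# Toy data, invented for this example.
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
result = frequent_itemsets(transactions, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

On this toy data, every single item and every pair meets the support threshold of 3, while the triple `("A", "B", "C")` appears in only two transactions and is dropped.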
