Speeding up frequent itemset mining process on XML data using graphic processor

XML is used extensively for data exchange between web applications, so mining XML documents has become an important area of research. Efficient methods are required for knowledge discovery from the enormous collections of XML documents that result, along with tools and technologies that can handle data at this scale. This paper proposes a methodology for handling such large-scale XML data using the GPU, a high-performance, low-cost computing platform. It aims to parallelize the pre-processing stage, namely deserialization and sorting, to prepare the dataset for mining.
