The Web is rich with information. However, the data on the Web is not well organized, which makes obtaining useful information from it a difficult task. The successful development of the Extensible Markup Language (XML) as a standard for representing semistructured data makes Web data more readable and makes mining useful information from the Web feasible. XML has become very popular for representing semistructured data and is a standard for data exchange over the Web, so mining XML data from the Web is becoming increasingly important. Previous studies adopt an Apriori-like candidate set generation approach, but candidate set generation is still costly. We propose that association rules can be extracted from XML documents, without any preprocessing or postprocessing, using the XML query language XQuery, and we analyze an XQuery implementation of the efficient FP-tree-based mining method, FP-growth, which mines the complete set of frequent patterns by pattern fragment growth. FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and it uses a partition-based, divide-and-conquer method. The divide-and-conquer method divides the problem into a number of subproblems and conquers the subproblems by solving them recursively; if the subproblems are small enough, it solves them directly. It then combines the solutions to the subproblems into the solution for the original problem. In addition, we suggest features that need to be added to XQuery in order to make the implementation of FP-growth more efficient.
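To make the pattern fragment growth idea concrete, the following is a minimal sketch of FP-growth in Python. It is not the XQuery implementation analyzed in this work; the names `Node`, `build_tree`, `fp_growth`, and `mine`, as well as the sample transactions, are assumptions chosen for illustration. The sketch builds an FP-tree, derives a conditional pattern base for each frequent item, and recurses on that smaller database, which is the divide-and-conquer step described above.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> tree nodes) and the frequent item supports.
    """
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Return {frozenset(pattern): support} without generating candidate sets."""
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, nodes in header.items():
        pattern = suffix | {item}
        patterns[pattern] = frequent[item]
        # Conditional pattern base: the prefix path of every node holding `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Divide and conquer: mine the smaller conditional database recursively.
        patterns.update(fp_growth(conditional, min_support, pattern))
    return patterns


def mine(transactions, min_support):
    """Convenience wrapper: every input transaction has weight 1."""
    return fp_growth([(t, 1) for t in transactions], min_support)
```

Calling `mine(transactions, min_support)` on a list of item lists returns a dictionary that maps each frequent itemset to its support count. The recursion stops on its own once a conditional pattern base is empty, which corresponds to the case where a subproblem is small enough to be solved directly.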