A Framework for Mining and Querying Summarized XML Data through Tree-Based Association Rules

The massive number of datasets expressed in different formats, such as relational, XML, and RDF, and available in many real applications, can pose difficulties for non-expert users who try to access them without sufficient knowledge of their content and structure. Moreover, composing queries, especially in the absence of a schema, and interpreting the answers obtained may be non-trivial. Data mining techniques, already widely applied to extract frequent correlations of values from both structured and semistructured datasets, provide several interesting solutions for knowledge elicitation. However, the mining process is often guided by the designer, who determines the portion of a dataset from which useful patterns can be extracted, based on deep knowledge of the application scenario. In our opinion, a research challenge is to mine hidden information from huge datasets and then use it in order to gain useful knowledge.

DOI: 10.4018/978-1-61350-356-0.ch012
