Paper information - Index Support for Mining Data Streams in a Relational DBMS

This paper presents a novel index, called I-Forest, to support data mining on data streams, i.e., sequences of incoming data blocks. The approach is suited to itemset extraction from evolving datasets, such as transactional data streams from retail chains. The index is a covering structure that represents transaction blocks in a succinct form and supports different kinds of analysis (e.g., analysis of quarterly data). No support constraint is enforced during the creation phase, so the index provides a complete representation of the data stream. The I-Forest index has been implemented in the PostgreSQL open source DBMS and exploits its physical-level access methods. Preliminary experiments have been run to validate the proposed approach.
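The idea of indexing each block without a support constraint and applying the threshold only at analysis time can be illustrated with a toy sketch. This is not the I-Forest structure: the abstract does not describe its layout, so the per-block `Counter`, the function names `index_block` and `frequent_itemsets`, and the monthly sample stream are all assumptions made for illustration. The sketch enumerates every itemset of every transaction, which is exponential in transaction length and only workable for tiny examples.

```python
from collections import Counter
from itertools import combinations


def index_block(transactions):
    """Count every itemset occurring in one block (no support threshold).

    Hypothetical stand-in for a per-block index; exhaustive enumeration
    is used only to keep the example short.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(block_indexes, block_ids, min_support):
    """Merge the counts of the selected blocks, then apply the threshold."""
    total = Counter()
    for block_id in block_ids:
        total.update(block_indexes[block_id])  # Counter.update adds counts
    return {s: c for s, c in total.items() if c >= min_support}


# Hypothetical stream: one block of transactions per month.
stream = {
    "2005-01": [["bread", "milk"], ["bread", "beer"]],
    "2005-02": [["bread", "milk"], ["milk"]],
    "2005-03": [["bread", "milk", "beer"]],
}
block_indexes = {block_id: index_block(ts) for block_id, ts in stream.items()}

# Quarterly analysis: combine three monthly blocks, threshold chosen at query time.
quarter = frequent_itemsets(
    block_indexes, ["2005-01", "2005-02", "2005-03"], min_support=3
)
print(quarter)
```

Because the per-block counts are complete, the same stored blocks can answer a different query (another time window or another `min_support`) without rescanning the raw transactions.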

Elena Baralis | Tania Cerquitelli | Silvia Chiusano | Diego Mostile

[1] Ganesh Ramesh, et al. Indexing and Data Access Methods for Database Mining, 2002, DMKD.

[2] Yonatan Aumann, et al. Borders: An Efficient Algorithm for Association Generation in Dynamic Databases, 1999, Journal of Intelligent Information Systems.

[3] Gösta Grahne, et al. Efficiently Using Prefix-trees in Mining Frequent Itemsets, 2003, FIMI.

[4] Jiawei Han, et al. Maintenance of discovered association rules in large databases: an incremental updating technique, 1996, Proceedings of the Twelfth International Conference on Data Engineering.

[5] Elena Baralis, et al. Index support for frequent itemset mining in a relational DBMS, 2005, 21st International Conference on Data Engineering (ICDE'05).

[6] Andrea Pietracaprina, et al. Mining Frequent Itemsets using Patricia Tries, 2003, FIMI.

[7] Johannes Gehrke, et al. Mining data streams under block evolution, 2002, SKDD.

[8] Osmar R. Zaïane, et al. Incremental mining of frequent patterns without candidate generation or support constraint, 2003, Seventh International Database Engineering and Applications Symposium, 2003. Proceedings.

[9] Jian Pei, et al. Mining frequent patterns without candidate generation, 2000, SIGMOD '00.

[10] Osmar R. Zaïane, et al. Inverted matrix: efficient discovery of frequent items in large datasets in the context of interactive mining, 2003, KDD '03.