Paper information - MINING FREQUENT PATTERNS IN DATA STREAM USING ENHANCED SLIDING WINDOW BASED RULE MINING ALGORITHM

Association rule mining techniques extract frequent patterns from a transaction set, and attribute frequency is the main input to the rule mining process. Rule mining techniques designed for static data sets are normally applied to databases and are not suitable for data streams. In stream-based rule mining, all data values arrive through the data stream from outside data sources, and the speed of the stream affects both data capture and mining. Lossy and approximate approaches, as well as sampling techniques, are used in stream-based mining models. The component-based fast mining algorithm uses separate components for data capture and pattern extraction, but it also fails on high-speed data streams. The sliding window based approach uses an approximation technique: data values are processed in sliding windows, and the top K mined rules are maintained in a heap. Each rule mining operation is performed on the recent data values only, so accuracy is high for recent data but is not ensured for rule mining over the entire data set. In this work, the component-based rule mining model and the sliding window model are integrated to perform rule mining on high-speed data streams. The system uses a data capture component and a rule mining component, and the sliding window model is modified to manage recent data values together with frequencies for the entire data set. The integrated rule mining system produces more accurate rules in less time. The system is developed in Java with an Oracle back end.
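The modified sliding window described in the abstract, which keeps recent data values alongside frequencies for the whole stream, can be illustrated with a minimal Python sketch. It is not the authors' implementation (the paper reports a Java and Oracle system). The class name `SlidingWindowCounter`, its methods, and the sample stream are hypothetical, and as a simplifying assumption it counts single items rather than full itemsets or association rules.

```python
from collections import Counter, deque
import heapq


class SlidingWindowCounter:
    """Hypothetical sketch: item frequencies over the most recent
    `window_size` transactions and over the entire stream so far."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # recent transactions, oldest first
        self.recent = Counter()  # item counts inside the window
        self.total = Counter()   # item counts over the whole stream

    def add(self, transaction):
        items = set(transaction)  # count each item once per transaction
        self.window.append(items)
        self.recent.update(items)
        self.total.update(items)
        if len(self.window) > self.window_size:
            # Slide the window: forget the oldest transaction.
            expired = self.window.popleft()
            self.recent.subtract(expired)
            for item in expired:
                if self.recent[item] == 0:
                    del self.recent[item]

    def top_k(self, k, recent=True):
        """Return the k most frequent (item, count) pairs using a heap.
        Ties are broken by item value, so items must be comparable."""
        counts = self.recent if recent else self.total
        return heapq.nlargest(k, counts.items(), key=lambda kv: (kv[1], kv[0]))


# Illustrative stream of four transactions with a window of two.
counter = SlidingWindowCounter(window_size=2)
for transaction in [["a", "b"], ["a", "c"], ["b", "c"], ["c", "d"]]:
    counter.add(transaction)

recent_top = counter.top_k(1)                 # window view
overall_top = counter.top_k(2, recent=False)  # whole-stream view
```

With this stream, the window view reflects only the last two transactions, while the whole-stream counters still remember item `a`, which has already left the window. Keeping both counters costs extra memory proportional to the number of distinct items seen, which is the trade-off this sketch makes to answer both kinds of query.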

K. Sangeetha | S. Prakash | S. Ashokumar
