This paper proposes a method for encoding a database. With this method, each record is represented by a single binary number, which sharply reduces the size of the database. When the encoding is applied to existing improved algorithms, their efficiency increases markedly. We also introduce a new algorithm, anti-Apriori, which is based on the proposed encoding. Using properties of numbers, the itemsets of the database are transformed into numerical fields. Unlike the Apriori algorithm, the new algorithm discovers association rules starting from the largest frequent itemset. All of its sub-itemsets, which are also frequent, are then obtained without any further calculation; the other small frequent itemsets that Apriori must generate are omitted, and the number of database scans is reduced. Test results show that the new algorithm, run on the encoded database, has lower time and space complexity.
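As an illustration of the idea, the following minimal Python sketch encodes each record as one integer (one bit per item) and searches for frequent itemsets from the largest candidates downward. It is not the paper's implementation: the function names, the sample data, the bit assignment, and the exhaustive candidate enumeration are all assumptions made for this example, and the enumeration is exponential in the number of items.

```python
from itertools import combinations

def encode(transaction, item_index):
    """Pack a set of items into one integer, one bit per item (assumed layout)."""
    code = 0
    for item in transaction:
        code |= 1 << item_index[item]
    return code

def decode(code, items):
    """Recover the item set from an integer code."""
    return frozenset(item for i, item in enumerate(items) if (code >> i) & 1)

def support(candidate, encoded_db):
    """Count records whose bit pattern contains every bit of the candidate."""
    return sum(1 for record in encoded_db if (record & candidate) == candidate)

def maximal_frequent(encoded_db, n_items, min_support):
    """Top-down search: test the largest candidates first and skip any
    candidate that is a subset of an itemset already found frequent."""
    maximal = []
    for size in range(n_items, 0, -1):
        for bits in combinations(range(n_items), size):
            candidate = sum(1 << b for b in bits)
            if any((candidate & m) == candidate for m in maximal):
                continue  # every subset of a frequent itemset is frequent
            if support(candidate, encoded_db) >= min_support:
                maximal.append(candidate)
    return maximal

# Hypothetical sample data, chosen only for illustration.
items = ["a", "b", "c", "d"]
item_index = {item: i for i, item in enumerate(items)}
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}]

encoded_db = [encode(t, item_index) for t in transactions]
found = maximal_frequent(encoded_db, len(items), min_support=2)
largest_itemsets = [decode(code, items) for code in found]
```

With this sample data, the itemset {a, b, c} is found at size three, so its subsets such as {a, b} are never counted against the database; only candidates outside it, such as {c, d}, still need a support check.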