Database Encoding and An Anti-Apriori Algorithm for Association Rules Mining

This paper proposes a method for encoding a database. With this method, each record is represented by a single binary number, so the size of the database is reduced sharply. When the database-encoding algorithm is applied to some known modified algorithms, their efficiency improves remarkably. In addition, a new algorithm, anti-Apriori, which is based on the proposed encoding method, is introduced. Using certain properties of numbers, the itemsets of the database are transformed into numerical fields. Unlike the Apriori algorithm, the new algorithm discovers association rules starting from the largest frequent itemsets; all of their sub-itemsets, which are also frequent, are then obtained without any further calculation. The other small frequent itemsets that Apriori must generate are omitted, and the number of database scans is also reduced. Test results show that the new algorithm, operating on the encoded database, has lower time and space complexity.
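The idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the exact encoding or search order, so the sketch assumes one bit per item, a bitwise-AND containment test, and a brute-force descent from the largest itemsets. The names `encode`, `support`, and `maximal_frequent`, as well as the toy database, are hypothetical.

```python
from itertools import combinations

def encode(record, items):
    """Pack one record (a set of items) into a single integer, one bit per item.
    Assumption: the item at index i in `items` maps to bit i."""
    code = 0
    for position, item in enumerate(items):
        if item in record:
            code |= 1 << position
    return code

def support(itemset_code, encoded_db):
    """Count the records containing the itemset: every bit of the itemset
    must also be set in the record."""
    return sum(1 for rec in encoded_db if (rec & itemset_code) == itemset_code)

def maximal_frequent(encoded_db, n_items, min_support):
    """Search from the largest itemsets downward (illustrative, exponential).
    Any subset of an already-found frequent itemset is skipped, because it is
    frequent by the downward-closure property and needs no further counting."""
    maximal = []
    for size in range(n_items, 0, -1):
        for bits in combinations(range(n_items), size):
            code = sum(1 << b for b in bits)
            if any((code & m) == code for m in maximal):
                continue  # covered by a larger frequent itemset
            if support(code, encoded_db) >= min_support:
                maximal.append(code)
    return maximal

# Toy database (hypothetical data): five records over four items.
ITEMS = ["a", "b", "c", "d"]
records = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"b", "d"}, {"b", "d"}]
encoded_db = [encode(r, ITEMS) for r in records]
result = maximal_frequent(encoded_db, len(ITEMS), min_support=2)
```

Here each record is stored as one small integer, and the search returns only the largest frequent itemsets; every sub-itemset of a returned itemset is frequent without being counted separately.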