A central goal of data mining is to reveal hidden knowledge in data, and many algorithms have been proposed for this purpose. Typically, however, not all generated rules are interesting: only a small fraction is of interest to any given user. Numerous measures, such as confidence, support, lift, and information gain, have therefore been proposed to identify the best or most interesting rules. Some algorithms generate rules that score high on one interestingness measure but low on others, and the relationship between algorithms and the interestingness measures of the rules they generate is not yet clear. In this paper, we study that relationship. We use synthetic data so that the results are not limited to specific cases. We report our experimental results and present the best combinations of algorithms and parameters for generating interesting rules.
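As an illustration of three of the measures named above, the following minimal sketch computes support, confidence, and lift for a single association rule using their standard textbook definitions. The function names (`support`, `rule_measures`) and the toy transactions are hypothetical and are not taken from the paper's experiments.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def rule_measures(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    # confidence = P(consequent | antecedent); guard against an empty antecedent match
    confidence = s_ac / s_a if s_a > 0 else 0.0
    # lift = confidence / P(consequent); a value of 1 indicates independence
    lift = confidence / s_c if s_c > 0 else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Toy data, invented for illustration only.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

measures = rule_measures(transactions, {"bread"}, {"milk"})
print(measures)
```

On this toy data the rule bread -> milk has a fairly high confidence but a lift below 1, which shows how the same rule can rank differently under different measures.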