Paper information - A Metric for Selection of the Most Promising Rules

The process of Knowledge Discovery in Databases aims to extract useful knowledge from large amounts of data. It comprises a pre-processing step, the application of a data-mining algorithm, and post-processing of the results. When rule induction is used for data mining, one must be prepared to deal with a large number of generated rules. In these circumstances it is important to have a way of selecting the rules with the highest predictive power. We propose a metric for selecting the n rules with the highest average distance between them. We argue that applying our metric to select the rules that are most distant from one another improves the system's predictive capability compared with other rule-selection criteria. We present an application example and empirical results obtained from a synthetic data set in a financial domain.
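The abstract does not define the distance between rules or the selection procedure, so the following is only a minimal sketch of the general idea of picking n mutually distant rules. It assumes, purely for illustration, that a rule is a set of (attribute, value) conditions, that the distance between two rules is the Jaccard distance between their condition sets, and that a greedy heuristic is an acceptable stand-in for an exact search. The names `jaccard_distance`, `average_distance`, and `select_distant_rules` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Assumed rule distance: 1 - |a & b| / |a | b| over condition sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_distance(rules):
    """Mean pairwise distance of a rule set (0.0 for fewer than two rules)."""
    pairs = list(combinations(rules, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def select_distant_rules(rules, n):
    """Greedy heuristic (an assumption, not the paper's method):
    seed with the most distant pair, then repeatedly add the rule
    with the largest total distance to the rules already selected."""
    rules = list(rules)
    if n <= 0:
        return []
    if n >= len(rules):
        return rules
    if n == 1:
        return rules[:1]
    # Work with indices so that duplicate rules are handled correctly.
    i, j = max(
        combinations(range(len(rules)), 2),
        key=lambda p: jaccard_distance(rules[p[0]], rules[p[1]]),
    )
    selected = [i, j]
    remaining = [k for k in range(len(rules)) if k not in (i, j)]
    while len(selected) < n:
        best = max(
            remaining,
            key=lambda k: sum(jaccard_distance(rules[k], rules[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return [rules[k] for k in selected]


# Hypothetical rules from a financial domain, written as condition sets.
r1 = frozenset({("income", "high"), ("debt", "low")})
r2 = frozenset({("income", "high"), ("debt", "low"), ("age", "young")})
r3 = frozenset({("savings", "none")})
r4 = frozenset({("income", "low")})

chosen = select_distant_rules([r1, r2, r3, r4], 3)
```

With these four hypothetical rules, `r1` and `r2` share two conditions and so are close to each other; the greedy selection keeps only one of them. A greedy pass does not guarantee the set with the globally highest average distance, which would require examining all subsets of size n.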

Carlos Bento | Pedro Gago
