Paper Information - A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Knowledge is indispensable, but voluminous knowledge becomes a bottleneck for efficient processing. A major challenge in data mining is that the mining process generates a large number of potential rules; sometimes the result size is comparable to that of the original data. Traditional pruning measures such as support do not sufficiently reduce this huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, so the knowledge base grows with each change. The predominant representation of discovered knowledge is the standard Production Rule (PR) of the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs) as an extension of production rules that exhibits variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement ‘If P Then D’ holds frequently and the assertion C holds rarely. With a rule of this type, we are free to ignore the exception condition when the resources needed to establish its presence are tight or when there is simply no information available as to whether it holds. Thus the ‘If P Then D’ part of the CPR expresses the important information, while the Unless C part acts only as a switch that changes the polarity of D to ~D. In this paper, a scheme based on the Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from previously discovered flat PRs. Discovering CPRs from flat rules would result in a considerable reduction in the number of already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of the knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.

Keywords: censored production rules, cumulative learning, data mining, machine learning.
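The ‘If P Then D Unless C’ form described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not model the Dempster-Shafer interpretation; it only shows the switching behaviour of the censor. The class name `CensoredRule`, the `check_censor` flag, and the bird/penguin rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Known facts; a missing key means "no information available".
Facts = Dict[str, bool]


@dataclass
class CensoredRule:
    """Hypothetical container for a rule 'If P Then D Unless C'."""
    premise: str   # P
    decision: str  # D
    censor: str    # C

    def infer(self, facts: Facts, check_censor: bool = True) -> Optional[str]:
        """Return D, '~D', or None if the premise is not established.

        check_censor=False models the tight-resources case, in which
        the exception condition is simply ignored.
        """
        if not facts.get(self.premise, False):
            return None
        # The censor flips the polarity of D only when it is known to hold.
        if check_censor and facts.get(self.censor, False):
            return "~" + self.decision
        return self.decision


# Illustrative rule (an assumption, not taken from the paper):
# If bird Then flies Unless penguin
rule = CensoredRule(premise="bird", decision="flies", censor="penguin")
```

With this sketch, `rule.infer({"bird": True})` yields `"flies"` because nothing is known about the censor, while `rule.infer({"bird": True, "penguin": True})` yields `"~flies"`. Passing `check_censor=False` skips the exception check entirely, which corresponds to ignoring the censor when resources are tight.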

Kamal Kant Bharadwaj | Rekha Kandwal

[1] Kamal Kant Bharadwaj, et al. Hierarchical Censored Production Rules (HCPRs) system, 1992, Data Knowl. Eng.

[2] Petra Perner, et al. Data Mining - Concepts and Techniques, 2002, Künstliche Intell.

[3] Pieter Adriaans, et al. Data mining, 1996.

[4] Kamal Kant Bharadwaj, et al. Some learning techniques in hierarchical censored production rules (HCPRs) system, 1998, Int. J. Intell. Syst.

[5] Christos Faloutsos, et al. Automated Learning and Discovery State-of-the-Art and Research Topics in a Rapidly Growing Field, 1999, AI Mag.

[6] Jennifer Widom, et al. Models and issues in data stream systems, 2002, PODS.

[7] Philip S. Yu, et al. Online Mining of Changes from Data Streams: Research Problems and Preliminary Results, 2003.

[8] Karl Rihaczek, et al. 1. WHAT IS DATA MINING?, 2019, Data Mining for the Social Sciences.

[9] Ryszard S. Michalski, et al. Variable Precision Logic, 1986, Artif. Intell.

[10] Nikola Kasabov, et al. Foundations Of Neural Networks, Fuzzy Systems, And Knowledge Engineering [Books in Brief], 1996, IEEE Transactions on Neural Networks.

[11] Wynne Hsu, et al. Intuitive Representation of Decision Trees Using General Rules and Exceptions, 2000, AAAI/IAAI.

[12] Vijay V. Raghavan, et al. Dynamic Data Mining, 2000, IEA/AIE.