Rule Induction in Cascade Model Based on Sum of Squares Decomposition

The cascade model is a rule induction methodology based on levelwise expansion of an itemset lattice, in which the explanatory power of a rule set and of its constituent rules is expressed quantitatively. The sum of squares for a categorical variable is decomposed into within-group and between-group sums of squares, and the latter gives a good representation of the power concept in the cascade model. Using the model, we can readily derive discrimination and characteristic rules that explain as much of the sum of squares as possible. Multiple rule sets are derived, proceeding from the core to the outskirts of knowledge. The sum of squares criterion can be applied in any rule induction system. The cascade model was implemented as DISCAS. Its algorithms are presented, and an applied example is provided for illustration.
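
The decomposition mentioned above can be illustrated with a minimal sketch. The abstract does not state the formula, so this assumes the Gini-style definition commonly used in analysis of variance for categorical data, SS = (n/2)(1 - Σ p(a)²), with the between-group term for each group taken as (n_g/2) Σ (p_g(a) - p(a))². The function names `sum_of_squares` and `decompose` are hypothetical and are not taken from DISCAS.

```python
from collections import Counter


def sum_of_squares(labels):
    """Sum of squares of a categorical sample, assuming SS = n/2 * (1 - sum p(a)^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return n / 2.0 * (1.0 - sum((c / n) ** 2 for c in counts.values()))


def decompose(groups):
    """Return (total, within, between) sums of squares for a list of groups.

    Each group is a list of category labels. Under the assumed definition,
    total equals within + between up to floating-point rounding.
    """
    all_labels = [label for group in groups for label in group]
    n = len(all_labels)
    total = sum_of_squares(all_labels)
    within = sum(sum_of_squares(group) for group in groups)

    overall = Counter(all_labels)
    between = 0.0
    for group in groups:
        n_g = len(group)
        if n_g == 0:
            continue  # an empty group contributes nothing
        local = Counter(group)
        # Squared distance between the group's distribution and the overall one.
        dist = sum((local.get(a, 0) / n_g - overall[a] / n) ** 2 for a in overall)
        between += n_g / 2.0 * dist
    return total, within, between
```

For example, `decompose([["a", "a", "a", "b"], ["b", "b", "b", "a"]])` gives a total of 2.0, a within-group sum of 1.5, and a between-group sum of 0.5. In this reading, a split into groups that yields a large between-group value would correspond to a rule with high explanatory power.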
