Data mining in deductive databases using query flocks

Data mining can be defined as the process of finding trends and patterns in large data sets. An important technique for extracting useful information, such as regularities, from (usually historical) data is association rule mining. Most research on data mining concentrates on the traditional relational data model. The query flocks technique, which extends association rule mining with a 'generate-and-test' model for different kinds of patterns, can also be applied to deductive databases. In this paper, the query flocks technique is extended with view definitions, including recursive views. In our system, the query flocks technique can be applied to a database schema that includes both the intensional database (IDB), i.e. rules, and the extensional database (EDB), i.e. relations stored as tables. We have designed an architecture that compiles query flocks from Datalog into SQL so that commercially available database management systems (DBMSs) can serve as the underlying engine of our system. Because recursive Datalog views (IDBs) cannot be converted directly into SQL statements, they are materialized before the final compilation step. Optimizations suitable for the extended query flocks are also introduced on this architecture. Using a prototype system developed in a commercial database environment, the advantages of the new architecture and its optimizations are presented.
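To make the pipeline described above concrete, the following is a minimal sketch rather than the system presented in the paper. It assumes a hypothetical schema (`baskets`, `parent`), a hypothetical recursive view `ancestor`, and uses SQLite through Python's `sqlite3` module as a stand-in for a commercial DBMS. The recursive view is materialized by simple fixpoint iteration, which is an assumption about how materialization could be done, and the flock is then expressed as a single SQL query whose `HAVING` clause plays the role of the support filter.

```python
import sqlite3


def materialize_ancestor(conn):
    """Materialize the (hypothetical) recursive Datalog view
         ancestor(X, Y) :- parent(X, Y).
         ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
    into an ordinary table by iterating until no new tuples appear."""
    conn.execute(
        "CREATE TABLE ancestor (child TEXT, anc TEXT, PRIMARY KEY (child, anc))"
    )
    conn.execute("INSERT OR IGNORE INTO ancestor SELECT child, parent FROM parent")
    size = conn.execute("SELECT COUNT(*) FROM ancestor").fetchone()[0]
    while True:
        # One application of the recursive rule; duplicates are ignored.
        conn.execute(
            "INSERT OR IGNORE INTO ancestor "
            "SELECT p.child, a.anc FROM parent p "
            "JOIN ancestor a ON p.parent = a.child"
        )
        new_size = conn.execute("SELECT COUNT(*) FROM ancestor").fetchone()[0]
        if new_size == size:  # fixpoint reached
            return
        size = new_size


def frequent_category_pairs(conn, support):
    """Evaluate an illustrative flock over the materialized view:
         answer(B) :- baskets(B, I1), ancestor(I1, $1),
                      baskets(B, I2), ancestor(I2, $2), $1 < $2
         filter: COUNT(answer.B) >= support
    The flock parameters $1 and $2 become GROUP BY columns in SQL."""
    rows = conn.execute(
        "SELECT a1.anc, a2.anc, COUNT(DISTINCT b1.bid) AS cnt "
        "FROM baskets b1 "
        "JOIN ancestor a1 ON b1.item = a1.child "
        "JOIN baskets b2 ON b2.bid = b1.bid "
        "JOIN ancestor a2 ON b2.item = a2.child "
        "WHERE a1.anc < a2.anc "
        "GROUP BY a1.anc, a2.anc "
        "HAVING COUNT(DISTINCT b1.bid) >= ? "
        "ORDER BY a1.anc, a2.anc",
        (support,),
    )
    return rows.fetchall()


def build_demo():
    """Create a small in-memory database with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (child TEXT, parent TEXT)")
    conn.execute("CREATE TABLE baskets (bid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO parent VALUES (?, ?)",
        [("cola", "soda"), ("soda", "drink"), ("juice", "drink"),
         ("chips", "snack"), ("nuts", "snack")],
    )
    conn.executemany(
        "INSERT INTO baskets VALUES (?, ?)",
        [(1, "cola"), (1, "chips"), (2, "juice"), (2, "nuts"),
         (3, "cola"), (3, "nuts")],
    )
    materialize_ancestor(conn)
    return conn


if __name__ == "__main__":
    demo = build_demo()
    for pair in frequent_category_pairs(demo, support=3):
        print(pair)
```

In this sketch the EDB relations stay as ordinary tables, only the recursive IDB is precomputed, and the generate-and-test step is left entirely to the SQL engine's grouping and `HAVING` filter.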
