Integrating data mining with database systems is becoming increasingly important as ever larger data warehouses are built around relational database technology. Most commercially available mining systems integrate only loosely (typically through an ODBC or SQL cursor interface) with data stored in DBMSs. When the mining algorithm makes multiple passes over the data, it is also possible to cache the data in flat files rather than retrieve it from the DBMS multiple times, to achieve better performance. Recent studies have found that, for association rule mining, carefully tuned SQL formulations can achieve performance comparable to systems that cache the data in files outside the DBMS. The SQL implementation also has the potential to offer other qualitative advantages, such as automatic parallelization, ease of development, portability, and inter-operability with relational operators. In this paper, we present several alternatives for formulating two mining tasks as SQL queries: association rule mining generalized to handle items with hierarchies on them, and sequential pattern mining. This work illustrates that computations significantly more complicated than simple boolean associations can be expressed in SQL using essentially the same framework.
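As a rough illustration of what expressing such a computation in SQL can look like, the sketch below counts the support of item pairs over transactions extended with their taxonomy ancestors. It is not one of the formulations presented in the paper. The table names (`sales`, `taxonomy`, `ext_sales`), the function `frequent_pairs`, and the sample data are all hypothetical, and the `taxonomy` table is assumed to already hold every (item, ancestor) pair, that is, the transitive closure of the hierarchy.

```python
import sqlite3


def frequent_pairs(transactions, taxonomy, min_support):
    """Return (item_a, item_b, support) rows for pairs meeting min_support.

    transactions: dict mapping a transaction id to a list of items.
    taxonomy: list of (child, ancestor) pairs, assumed transitively closed.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    conn.execute("CREATE TABLE taxonomy (child TEXT, ancestor TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(tid, item) for tid, items in transactions.items() for item in items],
    )
    conn.executemany("INSERT INTO taxonomy VALUES (?, ?)", taxonomy)

    # Extend each transaction with the ancestors of its items.
    # UNION removes duplicates, so each (tid, item) appears once.
    conn.execute(
        """
        CREATE TABLE ext_sales AS
        SELECT tid, item FROM sales
        UNION
        SELECT s.tid, t.ancestor
        FROM sales s JOIN taxonomy t ON s.item = t.child
        """
    )

    # Self-join on the transaction id to enumerate item pairs, skip pairs
    # made of an item and its own ancestor, then count support per pair.
    rows = conn.execute(
        """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM ext_sales a
        JOIN ext_sales b ON a.tid = b.tid AND a.item < b.item
        WHERE NOT EXISTS (
            SELECT 1 FROM taxonomy t
            WHERE (t.child = a.item AND t.ancestor = b.item)
               OR (t.child = b.item AND t.ancestor = a.item)
        )
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
        """,
        (min_support,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample_transactions = {1: ["jacket", "boots"], 2: ["ski_pants", "boots"], 3: ["jacket"]}
    sample_taxonomy = [
        ("jacket", "outerwear"),
        ("ski_pants", "outerwear"),
        ("boots", "footwear"),
    ]
    for row in frequent_pairs(sample_transactions, sample_taxonomy, 2):
        print(row)
```

With the sample data, no pair of leaf items reaches a support of 2, but the pair (boots, outerwear) does once ancestors are included, which is the kind of pattern that hierarchies make visible. A full mining run would repeat the join for larger itemsets and then derive rules from the counts; this sketch stops at pairs.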