Inductive Querying with Virtual Mining Views

In an inductive database, one can query not only the data stored in the database but also the patterns that are implicitly present in these data. In this chapter, we present an inductive database system in which the query language is traditional SQL. More specifically, we present a system in which the user can query the collection of all possible patterns as if they were stored in traditional relational tables. We show how such tables, or mining views, can be developed for three popular data mining tasks, namely itemset mining, association rule discovery, and decision tree learning. To illustrate the interactive and iterative capabilities of our system, we describe a complete data mining scenario that consists of extracting knowledge from real gene expression data after a pre-processing phase.
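
The idea of querying patterns "as if they were stored in traditional relational tables" can be illustrated with a minimal sketch for the itemset mining case. It is not the system described in this chapter: the table names `sets` and `supports`, their columns, and the toy transactions are assumptions made for illustration, and the tables are materialized by brute-force enumeration, whereas a virtual mining view would be filled on demand by a mining algorithm driven by the constraints in the query.

```python
import sqlite3
from itertools import combinations

# Toy transactions (hypothetical data, not taken from the chapter).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
]


def build_mining_tables(conn, transactions):
    """Store every non-empty itemset and its support in two plain tables.

    Assumed schema: sets(cid, item) lists the items of itemset `cid`;
    supports(cid, supp) gives the number of transactions containing it.
    """
    items = sorted(set().union(*transactions))
    conn.execute("CREATE TABLE sets (cid INTEGER, item TEXT)")
    conn.execute("CREATE TABLE supports (cid INTEGER PRIMARY KEY, supp INTEGER)")
    cid = 0
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            cid += 1
            supp = sum(1 for t in transactions if set(itemset) <= t)
            conn.executemany(
                "INSERT INTO sets VALUES (?, ?)", [(cid, i) for i in itemset]
            )
            conn.execute("INSERT INTO supports VALUES (?, ?)", (cid, supp))


def frequent_itemsets(conn, min_supp):
    """Select frequent itemsets with an ordinary SQL join and filter."""
    rows = conn.execute(
        """
        SELECT s.cid, s.item, p.supp
        FROM sets AS s JOIN supports AS p ON s.cid = p.cid
        WHERE p.supp >= ?
        ORDER BY s.cid, s.item
        """,
        (min_supp,),
    ).fetchall()
    grouped = {}
    for cid, item, supp in rows:
        grouped.setdefault(cid, ([], supp))[0].append(item)
    return sorted((tuple(items), supp) for items, supp in grouped.values())


conn = sqlite3.connect(":memory:")
build_mining_tables(conn, transactions)
print(frequent_itemsets(conn, 2))
```

The point of the sketch is that the minimum-support constraint is expressed as a plain `WHERE` clause, so no extension of SQL is needed to state the mining task.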
