Towards a Logic Query Language for Data Mining

We present a logic database language with elementary data mining mechanisms, designed to model the relevant aspects of knowledge discovery and to support both the iterative and the interactive features of the knowledge discovery process. We adopt the notion of user-defined aggregates to model typical data mining tasks as operations that unveil unseen knowledge. We illustrate the use of aggregates on specific data mining tasks, such as frequent pattern discovery, classification, data discretization, and clustering, and show how the resulting data mining query language can model typical steps of the knowledge discovery process, ranging from data preparation to knowledge extraction and evaluation.
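To make the idea of a data mining task expressed as a user-defined aggregate more concrete, the following is a minimal sketch in Python rather than in the logic language of the paper. It assumes an aggregate can be described by an initial state, a per-tuple update step, and a final result step. The names `FrequentPairs`, `step`, `result`, `aggregate`, and `min_support`, as well as the sample baskets, are hypothetical and chosen only for illustration. The sketch counts co-occurring item pairs, a deliberately simplified stand-in for frequent pattern discovery.

```python
from collections import Counter
from itertools import combinations


class FrequentPairs:
    """Hypothetical user-defined aggregate: initial state, per-tuple step, final result."""

    def __init__(self, min_support):
        # Initial state: no pair has been seen yet.
        self.min_support = min_support
        self.counts = Counter()

    def step(self, transaction):
        # Update the state with one input tuple (a basket of items).
        items = sorted(set(transaction))
        for pair in combinations(items, 2):
            self.counts[pair] += 1

    def result(self):
        # Final step: keep only pairs that reach the support threshold.
        return {p: c for p, c in self.counts.items() if c >= self.min_support}


def aggregate(agg, rows):
    """Fold an aggregate over a collection of tuples and return its result."""
    for row in rows:
        agg.step(row)
    return agg.result()


# Illustrative data, not taken from the paper.
baskets = [
    ["bread", "milk"],
    ["bread", "beer", "milk"],
    ["beer", "milk"],
    ["bread", "milk", "eggs"],
]

frequent = aggregate(FrequentPairs(min_support=2), baskets)
print(frequent)
```

In this sketch the mining step is used like any other aggregate over a relation, which is the intuition behind treating knowledge extraction as a query-level operation. A real frequent pattern aggregate would also handle itemsets larger than pairs.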
