Data Mining: From Procedural to Declarative Approaches

Abstract This article offers a viewpoint on the past and possible future development of data mining technology. At an introductory level, it gives some historical background on the development of data mining, sketches its relationship to other disciplines, and introduces a number of tasks that are typically considered data mining tasks. It then focuses on one particular aspect that may come to play a larger role in data mining: declarativeness. Although many different data mining tools have been developed, this variety still offers the user less flexibility than desired. It also creates a problem of choice: which tool is most suitable for a given problem? Declarative data mining may provide a solution. In other domains of computer science, declarative languages have led to major leaps forward in technology. Early results show that in data mining, too, declarative approaches are feasible and may make the process easier, more flexible, more efficient, and more correct.
