Paper information - Experiment Databases: A Novel Methodology for Experimental Research

Experiment Databases: A Novel Methodology for Experimental Research

Data mining and machine learning are experimental sciences: much insight into the behaviour of algorithms is obtained by implementing them and studying how they behave when run on datasets. However, such experiments are often not as extensive and systematic as they ideally would be, and therefore the experimental results must be interpreted with caution. In this paper we present a new experimental methodology that is based on the concept of “experiment databases”. An experiment database can be seen as a special kind of inductive database, and the experimental methodology consists of filling and then querying this database. We show that the novel methodology has numerous advantages over the existing one. As such, this paper presents a novel and interesting application of inductive databases that may have a significant impact on experimental research in machine learning and data mining.

Hendrik Blockeel
