Propositionalisation of Continuous Attributes beyond Simple Aggregation

Existing propositionalisation approaches mainly deal with categorical attributes; few handle continuous attributes. A first solution is to discretise numeric attributes, transforming them into categorical ones. Alternative approaches aggregate numeric attributes with simple functions such as the average, minimum, or maximum. We propose an approach dual to discretisation that reverses the processing of objects and thresholds, and whose discretisation corresponds to quantiles. We evaluate the approach thoroughly on artificial data, to characterise its behaviour with respect to two attribute-value learners, and on real datasets.
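
The contrast drawn above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the nearest-rank quantile rule, and the example values are all assumptions made for illustration. Each main object is taken to own a bag of numeric values from its secondary objects. Simple aggregation summarises the bag, discretisation fixes thresholds and counts the objects below them, and the dual view fixes a proportion of objects and reports the threshold that covers it, which is a quantile.

```python
import math
from statistics import mean


def aggregate_features(values):
    """Simple aggregation: summarise the bag with min, max and mean."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def count_features(values, thresholds):
    """Discretisation view: fix each threshold, count the objects at or below it."""
    return {f"count<={t}": sum(v <= t for v in values) for t in thresholds}


def quantile_features(values, proportions):
    """Dual view (assumed nearest-rank rule): fix a proportion of objects,
    report the smallest threshold that at least that proportion satisfies."""
    ordered = sorted(values)
    n = len(ordered)
    features = {}
    for p in proportions:
        k = max(1, math.ceil(p * n))  # number of objects that must be covered
        features[f"q{p}"] = ordered[k - 1]
    return features


# Hypothetical bag: numeric values of the secondary objects of one main object.
bag = [200, 50, 400, 80, 120]

print(aggregate_features(bag))
print(count_features(bag, thresholds=[100, 300]))
print(quantile_features(bag, proportions=[0.25, 0.5, 0.75]))
```

Under these assumptions, the count features depend on thresholds chosen in advance, whereas the quantile features adapt to the values in each bag, and either set of columns could then be passed to an attribute-value learner.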
