Transformation of Rational Data and Set Data to Logic Data

Frequently one wants to extend a classification method that in principle requires records with True/False values, such as decision trees and logic formula constructors, so that it can also process records containing rational numbers and/or nominal values. A nominal value is an element or a subset of a given finite set. In such cases, the rational numbers or nominal values must first be transformed to True/False values before the method can be applied. This chapter describes methods for that transformation. For nominal entries, the transformation depends on the size of the given finite set and on whether elements or subsets of that set occur; in particular, the scheme for subsets first transforms the entries to rational data. The transformation of rational numbers to True/False values uses a technique called Cutpoint that detects abrupt changes among the classification cases. The methods of this chapter are relatively new and have proved effective and reliable in preliminary tests.
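To make the rational-to-logic step concrete, the following is a minimal sketch of single-cut-point discretization: sort the records by the rational attribute, consider midpoints between adjacent records whose class labels differ (an "abrupt change"), and keep the candidate cut that best separates the two classes. This is an illustrative stand-in under simple assumptions, not the chapter's exact Cutpoint method; the function names `choose_cutpoint` and `to_logic` are hypothetical.

```python
def choose_cutpoint(values, labels):
    """Pick a midpoint between adjacent sorted values with differing
    class labels, keeping the candidate that separates best.
    Hypothetical illustration of cut-point discretization."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, -1
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if c1 != c2 and v1 != v2:
            cut = (v1 + v2) / 2.0
            left = [c for v, c in pairs if v < cut]
            right = [c for v, c in pairs if v >= cut]
            # score: records placed on the "right side" of the cut,
            # whichever class-to-side assignment works better
            score = max(left.count(c1) + right.count(c2),
                        left.count(c2) + right.count(c1))
            if score > best_score:
                best_cut, best_score = cut, score
    return best_cut

def to_logic(values, cut):
    """Map each rational value to a True/False logic value via the cut."""
    return [v >= cut for v in values]
```

For example, values `[1.0, 2.0, 3.0, 10.0, 11.0]` with labels `['A', 'A', 'A', 'B', 'B']` yield the cut point 6.5 and logic values `[False, False, False, True, True]`. The chapter's actual scheme refines this idea, e.g. by locating where classification cases change abruptly rather than testing every boundary.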
