Paper Information - Proposal and Empirical Comparison of a Parallelizable Distance-Based Discretization Method

Many classification algorithms are designed to work with datasets that contain only discrete attributes. Discretization is the process of converting the continuous attributes of a dataset into discrete ones so that such algorithms can be applied. In this paper we first review previous work on discretization. We then propose a new discretization method based on a distance proposed by López de Mántaras and show that it can be easily implemented in parallel, with a large improvement in its complexity. Finally, we show empirically that our method achieves excellent performance compared with other state-of-the-art methods.
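The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea, not the authors' method. It assumes the normalized distance between partitions associated with López de Mántaras, d_N(A, B) = (H(A|B) + H(B|A)) / H(A, B), and uses it to score single cut points of one continuous attribute against the class labels. The function names `entropy`, `mantaras_distance`, and `best_cut_point` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mantaras_distance(part_a, part_b):
    """Normalized distance between two partitions given as parallel label lists.

    Uses the identity (H(A|B) + H(B|A)) / H(A,B) = 2 - (H(A) + H(B)) / H(A,B).
    The result lies in [0, 1]; 0 means the partitions coincide.
    """
    joint = entropy(list(zip(part_a, part_b)))
    if joint == 0.0:
        return 0.0  # both partitions have a single block, so they coincide
    return 2.0 - (entropy(part_a) + entropy(part_b)) / joint


def best_cut_point(values, classes):
    """Return (cut, distance) for the threshold whose two-interval partition
    is closest to the class partition, or None if no cut is possible.

    Assumption for illustration: candidate cuts are midpoints between
    consecutive distinct attribute values.
    """
    distinct = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    if not candidates:
        return None
    scored = [
        (mantaras_distance([v <= cut for v in values], classes), cut)
        for cut in candidates
    ]
    dist, cut = min(scored)
    return cut, dist


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    classes = ["a", "a", "a", "b", "b", "b"]
    print(best_cut_point(values, classes))
```

Each candidate cut is scored independently of the others, which is one plausible reason a method of this kind lends itself to parallel evaluation; how the paper actually parallelizes its method is not stated in the abstract.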

Ramón López de Mántaras | Jesús Cerquides

[1] Ron Kohavi, et al. Supervised and Unsupervised Discretization of Continuous Features, 1995, ICML.

[2] Jean Dickinson Gibbons, et al. Nonparametric Statistical Inference, 1972, International Encyclopedia of Statistical Science.

[3] Usama M. Fayyad, et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, 1993, IJCAI.

[4] Jason Catlett, et al. On Changing Continuous Attributes into Ordered Discrete Attributes, 1991, EWSL.

[5] Randy Kerber, et al. ChiMerge: Discretization of Numeric Attributes, 1992, AAAI.

[6] Pat Langley, et al. An Analysis of Bayesian Classifiers, 1992, AAAI.