An integral Chi2 algorithm for discretization of real value attributes

The ChiMerge algorithm and its extensions have been shown to be efficient and effective for discretizing continuous attributes. However, all of these algorithms share a serious drawback: the probabilistic meaning of the merging criterion is not fully reflected when two intervals are merged. To overcome this drawback, this paper proposes the Integral Chi2 algorithm, which is based on the Chi2 algorithm. In the proposed algorithm, the criterion for interval merging is tied to its meaning in probability and statistics. Extensive experiments are conducted to evaluate the proposed algorithm against existing algorithms. The experimental results show that, overall, the proposed algorithm outperforms the existing ones.
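
For orientation, the merging criterion that ChiMerge-style methods build on is the Pearson chi-square statistic computed over the class counts of two adjacent intervals; the pair with the smallest value is the usual candidate for merging. The sketch below illustrates only that classic statistic, not the Integral Chi2 criterion proposed in this paper. The function name `chi2_adjacent` and the `eps` parameter (a small substitute for zero expected counts) are assumptions made for this illustration.

```python
from typing import Sequence


def chi2_adjacent(a: Sequence[int], b: Sequence[int], eps: float = 0.1) -> float:
    """Pearson chi-square statistic for two adjacent intervals.

    a[j] and b[j] are the numbers of samples of class j that fall
    into the first and second interval, respectively.
    """
    if len(a) != len(b):
        raise ValueError("both intervals must list counts for the same classes")

    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # no samples at all: nothing distinguishes the intervals

    chi2 = 0.0
    for row in (a, b):
        row_total = sum(row)
        for j, observed in enumerate(row):
            col_total = a[j] + b[j]
            # Expected count under independence of interval and class.
            expected = row_total * col_total / total
            if expected == 0:
                # Assumption: replace a zero expected count with a small
                # constant to avoid division by zero.
                expected = eps
            chi2 += (observed - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    # Identical class distributions give a statistic of zero,
    # so the two intervals are strong candidates for merging.
    print(chi2_adjacent([5, 5], [5, 5]))
    # Perfectly separated classes give a large statistic,
    # so the boundary between the intervals should be kept.
    print(chi2_adjacent([10, 0], [0, 10]))
```

A low value indicates that class membership is nearly independent of which interval a sample falls into, which is why such a pair is merged first.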
