Regression-Based Classification Methods and Their Comparisons with Desision Tree Algorithms

Classification learning can be considered as a regression problem with dependent variable consisting of 0s and 1s. Reducing classification to the problem of finding numerical dependencies we gain an opportunity to utilize powerful regression methods implemented in the PolyAnalyst data mining system. Resulting regression functions can be considered as fuzzy membership indicators for a recognized class. In order to obtain classifying rules, the optimum threshold values which minimize the number of misclassified cases can be found for these functions. We show that this approach allows one to solve the over-fit problem satisfactorily and provides results that are at least not worse than results obtained by the most popular decision tree algorithms.