DB-HReduction: A data preprocessing algorithm for data mining applications

Data preprocessing is an important and critical step in the data mining process and it has a huge impact on the success of a data mining project. In this paper, we present an algorithm DB-HReduction, which discretizes or eliminates numeric attributes and generalizes or eliminates symbolic attributes very efficiently and effectively. This algorithm greatly decreases the number of attributes and tuples of the data set and improves the accuracy and decreases the running time of the data mining algorithms in the later stage.