Scalable classifiers with dynamic pruning

This paper presents an algorithm for the classification problem in data mining applications. The algorithm is a decision tree classifier that uses a modified Gini index as the partitioning criterion. A pre-sorting technique avoids the cost of re-sorting the data at each node of the tree. This technique is integrated with a breadth-first tree growth strategy, which enables us to compute the best partition for every leaf node in a single scan of the database. We have also implemented the algorithm using a depth-first tree growth strategy. The algorithm uses a dynamic pruning approach that reduces the number of database scans and eliminates the need for a separate tree pruning phase. A proof of correctness, an analysis, and a performance study are also presented.
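
To illustrate why pre-sorting helps, the following minimal sketch evaluates every candidate binary split of one numeric attribute in a single pass over a list that is sorted once. It uses the standard Gini index, not the modified index of this paper, whose definition is not given in the abstract. The function names `gini` and `best_split` and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(counts, total):
    """Standard Gini impurity of a class-count mapping: 1 - sum(p_c ** 2)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Assumption: the standard Gini index stands in for the paper's modified one.
    The attribute is sorted once; class counts are then updated incrementally,
    so no candidate split requires re-sorting or re-counting.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])  # pre-sort once
    total = len(values)
    left = Counter()          # class counts on the left of the split
    right = Counter(labels)   # class counts on the right of the split
    best = (None, float("inf"))
    for pos, i in enumerate(order[:-1]):
        # Move one record from the right partition to the left partition.
        left[labels[i]] += 1
        right[labels[i]] -= 1
        nxt = values[order[pos + 1]]
        if values[i] == nxt:
            continue  # no threshold can separate equal attribute values
        n_left = pos + 1
        n_right = total - n_left
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / total
        if score < best[1]:
            best = ((values[i] + nxt) / 2, score)
    return best


# Hypothetical sample: one numeric attribute and a two-class label.
ages = [23, 17, 43, 68, 32, 20]
risk = ["high", "high", "high", "low", "low", "high"]
threshold, score = best_split(ages, risk)
```

In a breadth-first setting, the same incremental counting could be kept per leaf node, so that one scan of each sorted attribute list updates the candidate splits of all current leaves at once. This is a sketch of the general idea under the stated assumptions and not the implementation described in this paper.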