A novel data mining approach for soil classification

The decision tree is a well-known approach to classification in data mining. C4.5 and Classification and Regression Trees (CART) are two widely used decision tree algorithms. The main drawback of C4.5 is that it is biased towards attributes with more values, while CART produces misclassification errors when the domain of the target attribute is very large. In view of these limitations, this paper presents a modified decision tree algorithm. C4.5, CART and the proposed classifier are trained on a data set of soil samples using the optimal soil parameters, namely pH (power of hydrogen), EC (electrical conductivity) and ESP (exchangeable sodium percentage). The trained models are then evaluated on a test data set of soil samples. The results show that the modified decision tree algorithm achieves higher classification accuracy than C4.5 and CART.
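The bias towards attributes with more values mentioned above can be illustrated with the entropy-based split criteria used in the C4.5 family. The following is a minimal sketch and not the modified algorithm proposed in this paper. It compares plain information gain with gain ratio (the normalised criterion C4.5 uses) on a toy table. The rows, the class labels `sodic` and `normal`, and the attributes `sample_id` and `ph_band` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting on an attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain divided by the split information of the attribute."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0
    return information_gain(values, labels) / split_info


# Hypothetical toy data: 8 soil samples, 2 classes (not taken from the paper).
labels = ["sodic"] * 4 + ["normal"] * 4
# An identifier with one distinct value per row: many values, no predictive use.
sample_id = list(range(8))
# A coarse two-valued attribute that is informative but not perfect.
ph_band = ["high", "high", "high", "high", "low", "low", "low", "high"]

for name, column in [("sample_id", sample_id), ("ph_band", ph_band)]:
    print(name, information_gain(column, labels), gain_ratio(column, labels))
```

In this toy table, plain information gain ranks the many-valued `sample_id` above `ph_band`, while gain ratio reverses the order. Normalisation reduces the preference for many-valued attributes but does not guarantee that it disappears on every data set.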