An Investigation of Rule Induction Based Prediction Systems

Traditionally, researchers have predicted software effort using either off-the-shelf models such as COCOMO or local models developed with statistical techniques such as stepwise regression. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This position paper outlines some preliminary research into the use of rule induction methods to build software cost models. We briefly describe the use of rule induction methods and then apply the technique to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We show that RI methods tend to be unstable and generally predict with quite variable accuracy. Pruning the feature set, however, has a significant impact upon accuracy. We also compare our results with a prediction system based upon a standard regression procedure. We suggest that further work be carried out to examine how the relationships among, and between, the features of the attributes affect the generated rules, in an attempt to improve on current prediction techniques and enhance our understanding of machine learning methods.