Using grey relational analysis to predict software effort with small data sets

The inherent uncertainty of the software development process presents particular challenges for software effort prediction. We need to systematically address missing data values, feature subset selection and the continuous evolution of predictions as the project unfolds, all in the context of data starvation and noisy data. In this paper, however, we focus on feature subset selection and effort prediction at an early stage of a project. We propose a novel approach using grey relational analysis (GRA), a technique from grey system theory (GST), which is a recently developed system engineering theory based on the uncertainty of small samples. We address some of the theoretical challenges in applying GRA to feature subset selection and effort prediction, and then evaluate our approach on five publicly available industrial data sets using stepwise regression as a benchmark. The results are very encouraging in that they are comparable to or better than those of other machine learning techniques, and thus indicate that the method has considerable potential.
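As a rough illustration of how GRA can drive an analogy-style effort prediction, the sketch below computes grey relational grades between a new project and a handful of past projects, then averages the effort of the most closely related ones. It uses the textbook grey relational coefficient with a distinguishing coefficient of 0.5. The function names, the min-max normalisation, the choice of k and the simple mean are all assumptions made for illustration, not the procedure evaluated in the paper.

```python
from typing import List, Sequence


def min_max_normalise(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def grey_relational_grades(reference: Sequence[float],
                           candidates: Sequence[Sequence[float]],
                           zeta: float = 0.5) -> List[float]:
    """Grey relational grade of each candidate series against the reference.

    zeta is the distinguishing coefficient (0.5 is a common default,
    assumed here rather than taken from the paper).
    """
    deltas = [[abs(r - c) for r, c in zip(reference, cand)]
              for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every candidate is identical to the reference.
        return [1.0] * len(candidates)
    grades = []
    for row in deltas:
        # Grey relational coefficient per feature, averaged into a grade.
        coeffs = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


def predict_effort(new_features: Sequence[float],
                   past_features: Sequence[Sequence[float]],
                   past_efforts: Sequence[float],
                   k: int = 2) -> float:
    """Mean effort of the k past projects most related to the new one.

    Hypothetical helper: the unweighted mean of k analogues is an
    illustrative choice, not the paper's method.
    """
    rows = [list(f) for f in past_features] + [list(new_features)]
    normalised = min_max_normalise(rows)
    reference, candidates = normalised[-1], normalised[:-1]
    grades = grey_relational_grades(reference, candidates)
    ranked = sorted(range(len(candidates)),
                    key=lambda i: grades[i], reverse=True)[:k]
    return sum(past_efforts[i] for i in ranked) / len(ranked)
```

For example, `predict_effort([12, 1], [[10, 1], [20, 2], [100, 9]], [100, 200, 1000], k=1)` selects the first past project as the closest analogue. The same grades could in principle be used to rank individual features against effort for feature subset selection, although that step is not shown here.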
