The Use of Bayesian Networks for Web Effort Estimation: Further Investigation

The objective of this paper is to further investigate the use of Bayesian Networks (BNs) for Web effort estimation when using a cross-company dataset. Four BNs were built: two automatically, using the Hugin tool with two training sets, and two hybrid models, using a structure elicited by a domain expert with parameters obtained by automatically fitting the network to the same two training sets. The accuracy of all four models was measured using two validation sets and point estimates. As a benchmark, the BN-based predictions were also compared to predictions obtained using Manual StepWise Regression (MSWR) and Case-Based Reasoning (CBR). The BN model generated using Hugin presented accuracy similar to that of CBR and of mean effort-based predictions. Our results suggest that hybrid BN models can provide significantly superior prediction accuracy. However, good results also seem to depend on characteristics of the training and validation sets used.
