Estimating Story Points from Issue Reports

Estimating the effort of software engineering tasks is notoriously hard but essential for project planning. The agile community often uses issue reports to describe tasks and story points to estimate task effort. In this paper, we propose a machine learning classifier for estimating the story points required to address an issue. Through empirical evaluation on one industrial project and eight open source projects, we demonstrate that such a classifier is feasible. We show that, after initial training on over 300 issue reports, the classifier estimates a new issue in less than 15 seconds with a mean magnitude of relative error between 0.16 and 0.61. In addition, issue type, summary, description, and related components prove to be project-dependent features pivotal for story point estimation.
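The mean magnitude of relative error (MMRE) reported above can be illustrated with a minimal sketch. It assumes the conventional definition, the mean of |actual - predicted| / actual over all estimated issues; the abstract does not spell out the exact computation, and the function name `mmre` and the sample story points are hypothetical.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error, assuming the conventional definition:
    the mean of |actual - predicted| / actual over all issues."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Actual story points are assumed to be positive, so the division is safe.
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical story points for illustration only (not data from the paper).
actual = [1, 2, 3, 5, 8]
predicted = [1, 3, 3, 5, 5]
score = mmre(actual, predicted)
print(f"MMRE = {score:.3f}")
```

Under this definition, a value of 0.16 would mean that estimates deviate from the actual story points by 16% on average, and lower values indicate better estimates.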
