Use of Similarity Measure to Suggest the Existence of Duplicate User Stories in the Scrum Process

In the Scrum process, the Product Backlog is a prioritized list of desired software functionalities recorded as user stories. As the software product is developed, new functionalities are discovered and added to the Product Backlog. In large-scale projects, however, duplicate stories may arise because of the large number of stories generated, the lack of communication among team members, and the speed of development imposed by the Scrum process. It is therefore important to detect duplicate stories in order to avoid reworking the same software feature. This paper presents an approach that uses semantic similarity measures to suggest possible cases of duplication between user stories. These alerts can help Product Owners and Scrum Masters decide whether to exclude duplicate user stories from the Product Backlog.
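As a rough illustration of how such an alert might work, the sketch below compares every pair of stories in a backlog and flags those whose similarity meets a threshold. The abstract does not say which measures the paper uses, so this sketch substitutes Jaccard token overlap, a simple lexical measure rather than a semantic one. The function names, the sample stories, and the 0.6 threshold are all assumptions made for illustration.

```python
import re
from itertools import combinations


def tokens(story: str) -> set[str]:
    """Lowercase a story and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", story.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the token sets of two stories (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def suggest_duplicates(backlog: list[str], threshold: float = 0.6):
    """Return (i, j, score) for story pairs scoring at or above threshold.

    The threshold is a hypothetical default, not a value from the paper.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(backlog), 2):
        score = jaccard(a, b)
        if score >= threshold:
            pairs.append((i, j, score))
    # Most similar pairs first, so a reviewer sees the likeliest duplicates.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical backlog used only to exercise the sketch.
backlog = [
    "As a customer, I want to reset my password so that I can regain access",
    "As a customer, I want to reset my password so I can regain access to my account",
    "As an admin, I want to export monthly reports as PDF",
]

for i, j, score in suggest_duplicates(backlog):
    print(f"Stories {i} and {j} may be duplicates (score {score:.2f})")
```

The output is only a suggestion: the decision to remove a story stays with the Product Owner or Scrum Master. A semantic measure could replace `jaccard` without changing the rest of the sketch, which is why the scoring function is kept separate from the pairing loop.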
