Detecting User Story Information in Developer-Client Conversations to Generate Extractive Summaries

User stories are descriptions of functionality that a software user needs. They play an important role in determining which software requirements and bug fixes should be handled, and in what order. Developers elicit user stories through meetings with customers, but elicitation is complex and takes many passes to accommodate shifting and unclear customer needs. As a result, developers must take detailed notes during meetings or risk missing important information. Ideally, developers would be freed from taking notes themselves and could instead speak naturally with their customers. This paper is a step towards that ideal. We present a technique for automatically extracting information relevant to user stories from recorded conversations between customers and developers. We first perform a qualitative study to demonstrate that user story information exists in these conversations in sufficient quantity to extract automatically; in it, we found that roughly 10.2% of these conversations contained user story information. We then test our technique in a quantitative study to determine the degree to which it can extract user story information. In our experiment, our process obtained about 70.8% precision and 18.3% recall on that information.
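To make the reported metrics concrete, the following is a minimal sketch of how precision and recall can be computed when each conversation turn is labeled as containing user story information or not. The function name `precision_recall` and the toy labels are assumptions made for illustration; they are not the paper's data or evaluation code.

```python
def precision_recall(gold, predicted):
    """Compute precision and recall for binary per-turn labels.

    gold, predicted: equal-length lists of booleans, one entry per
    conversation turn (True = turn contains user story information).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if not g and p)
    false_neg = sum(1 for g, p in pairs if g and not p)

    # Precision: fraction of extracted turns that are truly relevant.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: fraction of truly relevant turns that were extracted.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical toy labels for ten turns (not taken from the paper).
gold = [True, False, False, True, False, False, True, False, False, True]
predicted = [True, False, False, False, False, False, False, False, False, False]

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the single extracted turn is correct, so precision is high, but three of the four relevant turns are missed, so recall is low. A high-precision, low-recall result has this general shape: what is extracted is mostly relevant, while much relevant material is left behind.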
