SPt: A Text Mining Process to Extract Relevant Areas from SW Documents to Exploratory Tests

Software products must reach high quality levels to succeed in a competitive market. Product reliability is usually assured through testing activities. However, SW testing is sometimes neglected by companies because of its high cost, particularly when executed manually. In this light, this work investigates intelligent methods for SW testing automation, focusing on the software product review process. We propose a new process for test plan creation based on the inspection of SW documents (in particular, Release Notes) using text mining techniques. The implemented prototype, the SWAT Plan tool (SPt), automatically extracts from Release Notes the relevant areas of the SW to be examined by exploratory test teams. SPt was tested using real-world data from Motorola Mobility, our partner company. The experiments compared the current manual process with the automated process using SPt, assessing the time spent and the relevant areas identified by each method. The results obtained were very encouraging.
