Automated Validation of Requirement Reviews: A Machine Learning Approach

Software development is fault-prone, especially during the fuzzy early phases (requirements and design). Software inspections are commonly used in industry to detect and fix problems in requirements and design artifacts, thereby mitigating fault propagation to later phases, where the same faults are harder to find and fix. The output of an inspection process is a set of natural language (NL) reviews that report the location and description of faults in the software requirements specification (SRS) document. The artifact author must manually read through the reviews and differentiate between true faults and false positives before fixing the faults. The time spent making effective post-inspection decisions (estimating the number of true faults and deciding whether to re-inspect) could instead be spent on actual development work. The goal of this research is to automate the validation of inspection reviews, find common patterns that describe high-quality requirements, identify fault-prone requirements pre-inspection, and identify interrelated requirements to assist in fixing reported faults post-inspection. To accomplish these goals, this research applies various classification approaches, NL processing with semantic analysis, and mining solutions from graph theory to requirement reviews and NL requirements. Initial results on the validation of inspection reviews show that the proposed approaches successfully categorize useful and non-useful reviews.
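To illustrate the review-validation idea, the sketch below trains a minimal bag-of-words Naive Bayes classifier to separate useful from non-useful review comments. This is not the paper's actual pipeline; the review texts, labels, and tokenizer are invented for illustration, and a real system would use richer features (e.g. POS tags and semantic similarity).

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Simplistic lowercase whitespace tokenization; a real pipeline
    # would use POS tagging, stemming, and semantic features.
    return text.lower().split()

def train(samples):
    """samples: list of (review_text, label) pairs; builds per-class
    word counts for a Naive Bayes model with Laplace smoothing."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, label_counts, vocab

def classify(model, text):
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior + log likelihood with add-one (Laplace) smoothing
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled inspection reviews (invented for illustration)
reviews = [
    ("requirement 3 omits the error handling case", "useful"),
    ("section 2 contradicts the timing constraint in section 5", "useful"),
    ("looks fine to me", "non-useful"),
    ("no comment", "non-useful"),
]
model = train(reviews)
print(classify(model, "the timing constraint omits an error case"))  # prints "useful"
```

A review mentioning a concrete fault location and description shares vocabulary with the "useful" class, so it scores higher there, mirroring how fault-describing reviews were separated from vague ones.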
