Application of back-translation: a transfer learning approach to identify ambiguous software requirements

Ambiguous requirements are problematic in requirements engineering because different stakeholders may disagree about their interpretation, leading to a variety of issues in later development stages. Since requirement specifications are usually written in natural language, analyzing ambiguous requirements is still largely a manual process, as automation has not yet reached industry standards. In this paper, we apply transfer learning using ULMFiT (Universal Language Model Fine-tuning): we pre-train a language model on a general-domain corpus and then fine-tune it for the target task of classifying requirements as ambiguous or unambiguous. We compare its accuracy against standard machine learning classifiers, namely Support Vector Machines (SVM), Logistic Regression, and Multinomial Naive Bayes. We also use back-translation (BT) as a text augmentation technique to test whether it improves classification accuracy. Our results show that ULMFiT achieves higher accuracy than SVM, Logistic Regression, and Multinomial Naive Bayes on our initial data set. Furthermore, after augmenting the requirements using BT, ULMFiT again outperforms all three classifiers, improving on its initial performance by 5.371%. Our research provides promising insights into how transfer learning and text augmentation can be applied to small data sets in requirements engineering.
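The baseline comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's actual pipeline: the toy requirements, labels, and the choice of TF-IDF features are assumptions for demonstration, and the models are evaluated on their own training data only.

```python
# Sketch of the baseline classifier comparison (SVM, Logistic Regression,
# Multinomial Naive Bayes) on a tiny illustrative data set.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy examples: 1 = ambiguous requirement, 0 = unambiguous.
requirements = [
    "The system shall respond quickly to user input.",
    "The system shall respond within 2 seconds of user input.",
    "The interface should be user-friendly where appropriate.",
    "The login page shall lock the account after 3 failed attempts.",
]
labels = [1, 0, 1, 0]

classifiers = {
    "SVM": LinearSVC(),
    "Logistic Regression": LogisticRegression(),
    "Multinomial Naive Bayes": MultinomialNB(),
}

scores = {}
for name, clf in classifiers.items():
    # Each classifier consumes the same TF-IDF representation of the text.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(requirements, labels)
    scores[name] = model.score(requirements, labels)
    print(f"{name}: training accuracy {scores[name]:.2f}")
```

A real experiment would use a held-out test split (or cross-validation, given the small data set) rather than training accuracy; ULMFiT fine-tuning and back-translation augmentation would be layered on top of a setup like this.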
