Automated Identification of Security Requirements: A Machine Learning Approach

Characterizing security requirements early helps system designers integrate security aspects into the initial architectural design. However, distinguishing security-related requirements from other functional and non-functional requirements can be tedious and error-prone. Machine learning techniques have proven successful at identifying security requirements and can address this issue. In this paper, we report an empirical study that evaluates the performance of 22 supervised machine learning classification algorithms and two deep learning approaches in classifying security requirements, using the publicly available SecReq dataset. More specifically, we focus on the robustness of these techniques with respect to the overhead of the pre-processing step. Results show that the long short-term memory (LSTM) network achieved the best accuracy (84%) among the deep learning approaches, while Boosted Ensemble achieved the highest accuracy (80%) among the 22 supervised machine learning classifiers.
