Semantic Frame Embeddings for Detecting Relations between Software Requirements

The early phases of requirements engineering (RE) deal with a large number of software requirements (i.e., requirements that define characteristics of software systems), which are typically expressed in natural language. Analysing such unstructured requirements, usually obtained from stakeholders’ inputs, is a challenging task because natural language is inherently ambiguous and inconsistent. Methods based on natural language processing (NLP) can support this task. One of the more recent advances in NLP is the use of word embeddings to capture contextual information, which can then be applied in word analogy tasks. In this paper, we describe a new resource, namely embedding-based representations of the semantic frames in FrameNet, developed to support the detection of relations between software requirements. Our embeddings, which encapsulate contextual information at the semantic frame level, were trained on a large corpus of requirements (i.e., a collection of more than three million mobile application reviews). The similarity between these frame embeddings is then used as a basis for detecting semantic relatedness between software requirements. Compared with existing resources in which frame embeddings are built upon pre-trained vectors, our proposed frame embeddings performed better when evaluated against the judgments of an RE expert. These encouraging results demonstrate the potential of the resource to support RE analysis tasks (e.g., traceability), which we plan to investigate as part of our immediate future work.
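To illustrate how similarity between frame embeddings could serve as a relatedness signal, the following is a minimal sketch and not the authors' implementation. It assumes each requirement has already been annotated with semantic frames; the vectors are made-up three-dimensional toy values, the frame labels are illustrative, and the aggregation in `relatedness` (averaging each frame's best cosine match) is an assumption introduced only for this example.

```python
import math

# Toy vectors standing in for trained frame embeddings.
# Labels and values are illustrative assumptions, not taken from the resource.
FRAME_VECTORS = {
    "Sending": [0.9, 0.1, 0.0],
    "Receiving": [0.8, 0.2, 0.1],
    "Removing": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(frames_a, frames_b, vectors=FRAME_VECTORS):
    """Hypothetical aggregation: for each frame evoked by requirement A,
    take its best cosine match among the frames of requirement B,
    then average those best matches."""
    if not frames_a or not frames_b:
        return 0.0
    best = [
        max(cosine(vectors[fa], vectors[fb]) for fb in frames_b)
        for fa in frames_a
    ]
    return sum(best) / len(best)


# Two requirements with nearby frames score higher than two with distant frames.
close_pair = relatedness(["Sending"], ["Receiving"])
far_pair = relatedness(["Sending"], ["Removing"])
print(close_pair > far_pair)
```

Under these toy values the first pair scores higher than the second; with real frame embeddings, a threshold on such a score would have to be calibrated against expert judgments.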
