Software requirements as an application domain for natural language processing

Abstract

Mapping functional requirements first to specifications and then to code is one of the most challenging tasks in software development. Because requirements are commonly written in natural language, they are prone to ambiguity, incompleteness and inconsistency. Structured semantic representations allow requirements to be translated into formal models, which can be validated to detect problems at an early stage of the development process. Storing and querying such models can also facilitate software reuse. Several approaches constrain the input format of requirements in order to produce specifications; however, they usually require considerable human effort to adopt domain-specific heuristics and/or controlled languages. We propose a mechanism that automates the mapping of requirements to formal representations using semantic role labeling. We describe the first publicly available dataset for this task, employ a hierarchical framework that allows requirements concepts to be annotated, and discuss how semantic role labeling can be adapted for parsing software requirements.
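
To make the idea of a structured semantic representation concrete, the following minimal Python sketch shows how role-labeled requirements could be stored, queried and checked. It is an illustration only, not the annotation scheme or system described in the paper: the `Frame` class, the role names `actor`, `action` and `obj`, the helper functions and the example requirements are all hypothetical, and the frames are written by hand in place of the output of a semantic role labeler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One annotated requirement: a predicate with labeled role fillers.

    The role inventory here is an assumption made for illustration.
    """
    actor: str   # who performs the action
    action: str  # the predicate of the requirement
    obj: str     # what the action is performed on


# Hand-written annotations standing in for a role labeler's output, e.g.
# "The user can create an account." -> Frame("user", "create", "account")
frames = [
    Frame(actor="user", action="create", obj="account"),
    Frame(actor="user", action="delete", obj="account"),
    Frame(actor="administrator", action="delete", obj="account"),
    Frame(actor="user", action="update", obj="profile"),
]


def actions_on(frames, obj):
    """Query: the sorted set of actions that any actor performs on obj."""
    return sorted({f.action for f in frames if f.obj == obj})


def objects_never_created(frames):
    """Validation: objects that are acted upon but never created.

    A toy completeness check; a real validator would use a richer model.
    """
    created = {f.obj for f in frames if f.action == "create"}
    return sorted({f.obj for f in frames} - created)


print(actions_on(frames, "account"))
print(objects_never_created(frames))
```

In this toy setting, the query lists which actions the stored requirements define for a given object, and the check flags `profile` as possibly incomplete because no requirement says how a profile comes into existence.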
