Requirement Text Detection from Contract Packages to Support Project Definition Determination

Project requirements are the client's wishes and expectations regarding design, construction, and other project management processes. The project definition is typically specified in a contract package comprising the contract document and many related documents, such as drawings, specifications, and government codes. Determining the project definition is critical to the success of a project. Because efficient tools for requirement processing are lacking, current project scoping practice still relies heavily on manual review, which is tedious, time-consuming, and error-prone. This study aims to fill that gap by developing an automated method for identifying requirement text in contractual documents. A Naive Bayes classifier was trained to separate requirement statements from non-requirement statements. An experiment on a manually labeled dataset of 1,191 statements showed that the resulting requirement detection model achieves a promising accuracy of over 90%.
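
To make the classification step concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier written with only the Python standard library. It is not the study's implementation: the abstract does not describe the features, preprocessing, or toolkit used, so the bag-of-words representation, the add-one smoothing, the `tokenize` helper, the class name `NaiveBayesTextClassifier`, and the toy statements are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing.

    Hypothetical helper class for illustration; not taken from the study.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # statements per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            score = math.log(n_docs / self.total_docs)  # log prior
            for t in tokens:
                # Smoothed log likelihood of the word given the class.
                score += math.log((self.word_counts[label][t] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy statements invented for this sketch; they are not from the study's dataset.
train_texts = [
    "The contractor shall provide all labor and materials.",
    "The design-builder must submit shop drawings for approval.",
    "All concrete shall conform to the referenced specifications.",
    "The engineer is required to approve each submittal.",
    "This section describes the general background of the project.",
    "The site is located near the river.",
    "Appendix A contains a list of abbreviations.",
    "The project was initiated in response to traffic growth.",
]
train_labels = ["requirement"] * 4 + ["non-requirement"] * 4

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("The contractor shall submit a schedule.")
background = model.predict("The site is near the river.")
```

On a realistic dataset, accuracy would be estimated on held-out statements (for example with cross-validation) rather than on a handful of examples like these; the abstract does not state which evaluation protocol the study used.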
