A Machine-Learning Approach for Semantic Matching of Building Codes and Building Information Models (BIMs) for Supporting Automated Code Checking

Various automated code compliance checking (ACC) systems have been developed and used to check the compliance of building information models (BIMs) with building codes, to reduce the time, cost, and errors of the code compliance checking process. All of these systems require some form of code-BIM matching (matching the concept representations in the codes to those in the BIMs), which is a difficult task. Traditionally, this semantic matching was conducted largely manually. To address this problem, a limited number of recent efforts have proposed fully automated semantic matching methods, which mostly rely on matching annotations and/or rules developed by domain experts. Despite their relatively good performance, these methods are by nature difficult to generalize or scale up (e.g., the matching rules need to be updated, modified, or extended when switching from one type of code to another). There is, thus, a need for semantic matching approaches that are more generalizable and scalable. To address this need, this paper proposes a new, machine learning-based approach to automatically match building-code concepts and relations to their equivalent concepts and relations in the Industry Foundation Classes (IFC). The proposed approach consists of five primary tasks: (1) prepare and process the training and testing data; (2) automatically learn domain word embeddings from a large corpus of building-code text and generate the final semantic representations by combining the domain and general word embeddings; (3) match the building-code concepts to the IFC elements; (4) match the building-code relations to the IFC relations; and (5) evaluate the performance of the proposed approach in terms of accuracy. The proposed approach was implemented and tested on a number of chapters from the 2009 International Building Code (IBC) and the Champaign 2015 IBC Amendments.
The preliminary results show that the proposed approach achieved an accuracy of 77% for matching building-code concepts to IFC elements, and 78% for matching building-code relations to IFC relations, indicating promising semantic matching performance.
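
To make tasks (2), (3), and (5) concrete, the following is a minimal illustrative sketch, not the implementation evaluated in this paper. It assumes that the domain and general embeddings are combined by simple concatenation and that a building-code concept is matched to the IFC element with the highest cosine similarity; the abstract does not specify either choice. The function names (`combine`, `cosine`, `match_concept`, `accuracy`) and all vector values are hypothetical toy data chosen only for illustration.

```python
import math

def combine(domain_vec, general_vec):
    """Combine two embeddings by concatenation (an assumed combination scheme)."""
    return list(domain_vec) + list(general_vec)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def match_concept(concept_vec, ifc_vecs):
    """Return the name of the IFC element whose vector is most similar."""
    return max(ifc_vecs, key=lambda name: cosine(concept_vec, ifc_vecs[name]))

def accuracy(predicted, gold):
    """Fraction of predictions that equal the gold-standard match."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Hypothetical 2-d toy embeddings; real embeddings would be learned from text.
domain = {
    "door": [1.0, 0.0], "wall": [0.0, 1.0], "stairway": [0.7, 0.7],
    "IfcDoor": [0.9, 0.1], "IfcWall": [0.1, 0.9], "IfcStair": [0.6, 0.8],
}
general = {
    "door": [1.0, 0.2], "wall": [0.2, 1.0], "stairway": [-0.5, 0.5],
    "IfcDoor": [0.9, 0.3], "IfcWall": [0.3, 0.9], "IfcStair": [-0.6, 0.4],
}

# Final semantic representation of every term: domain + general embedding.
combined = {term: combine(domain[term], general[term]) for term in domain}

code_concepts = ["door", "wall", "stairway"]
gold = ["IfcDoor", "IfcWall", "IfcStair"]  # assumed gold-standard matches
ifc_vecs = {name: combined[name] for name in gold}

predictions = [match_concept(combined[c], ifc_vecs) for c in code_concepts]
print(predictions, accuracy(predictions, gold))
```

Under the same assumptions, relation matching (task 4) could reuse `match_concept` with embeddings of relation terms in place of concept terms.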