Predicting project cost overrun levels in bidding stage using ensemble learning

ABSTRACT The prediction of project cost overruns at the bidding stage has changed significantly with the application of state-of-the-art techniques. To improve predictions of cost performance, modeling techniques should be integrated with domain knowledge. This study developed an ensemble-learning classification model that predicts the expected cost-overrun level of public projects and identifies explanatory factors and key predictors. The model was built on a database of 234 public-sector projects in South Korea that combines project characteristics (project delivery method, project type, cost, and schedule) with bidding characteristics (award method, number of bidders, bid-to-estimate ratio, and number of joint ventures). The model achieved an average accuracy of 61.41% over five model runs. Information on the type of project being constructed was an important contributor to prediction accuracy. The model's results enable project owners and managers to screen for projects expected to incur excessive cost overruns and to anticipate budget losses during the bidding stage, before contracts are finalized.
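
The abstract does not say which ensemble scheme was used, so the following is only a minimal sketch of the general idea, assuming simple majority voting over base classifiers and a plain mean of per-run accuracies. The base learners (`by_bid_ratio`, `by_bidders`, `by_type`), their thresholds, the field names, and the two-level labels `"high"`/`"low"` are hypothetical and are not taken from the study.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(base_models, project):
    """Ask every base model for a label and combine them by majority vote."""
    return majority_vote([model(project) for model in base_models])


def accuracy(base_models, samples):
    """Share of (project, true_label) pairs the ensemble classifies correctly."""
    hits = sum(ensemble_predict(base_models, x) == y for x, y in samples)
    return hits / len(samples)


def mean_accuracy(base_models, runs):
    """Average the ensemble accuracy over several evaluation runs."""
    return mean(accuracy(base_models, samples) for samples in runs)


# Hypothetical rule-based base learners; thresholds are illustrative only.
def by_bid_ratio(p):
    return "high" if p["bid_to_estimate"] < 0.80 else "low"


def by_bidders(p):
    return "high" if p["n_bidders"] > 10 else "low"


def by_type(p):
    return "high" if p["project_type"] == "civil" else "low"


BASE_MODELS = [by_bid_ratio, by_bidders, by_type]
```

Calling `mean_accuracy(BASE_MODELS, runs)` with one list of `(project, label)` pairs per run gives a single summary figure of the kind reported above; in practice the base learners would be trained models rather than fixed rules.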
