Analyzing Asset Management Data Using Data and Text Mining

Predictive models using text from a sample of competitively bid California highway projects have been used to predict a construction project's likely level of cost overrun. A text description of the project and the text of the five largest project line items were used as input. The text data were converted to numerical attributes using text-mining algorithms and singular value decomposition. Two models were produced. The first used only the text description as input, while the second combined the text data with the numeric value of the low bid. Classification models were produced using the K-Star classification algorithm. Modeling results indicated that information in the textual descriptions is related to the project's level of cost overrun.
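The pipeline described above (text to numeric attributes, reduction by singular value decomposition, then an instance-based classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the project descriptions, overrun labels, and low-bid values are invented, the function names are hypothetical, raw word counts stand in for the study's text-mining step, and a plain 1-nearest-neighbor rule with Euclidean distance is swapped in for K-Star, which is also instance-based but uses an entropy-based distance.

```python
import math
from collections import Counter

STOP_WORDS = {"and", "of", "with", "the"}  # tiny illustrative list

def tokenize(text):
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]

def count_matrix(docs, vocab=None):
    """Rows are documents, columns are vocabulary terms (raw counts)."""
    if vocab is None:
        vocab = sorted({w for d in docs for w in tokenize(d)})
    index = {w: j for j, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0.0] * len(vocab)
        for w, c in Counter(tokenize(d)).items():
            if w in index:  # words unseen in training are ignored
                row[index[w]] = float(c)
        rows.append(row)
    return rows, vocab

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_right_singular_vectors(A, k, iters=300):
    """Truncated SVD via power iteration with deflation on A^T A."""
    n = len(A[0])
    found = []
    for _component in range(k):
        v = [1.0] * n
        for _step in range(iters):
            for u in found:  # stay orthogonal to earlier components
                p = dot(v, u)
                v = [x - p * y for x, y in zip(v, u)]
            Av = [dot(row, v) for row in A]
            w = [sum(A[i][j] * Av[i] for i in range(len(A))) for j in range(n)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        found.append(v)
    return found

def project(A, V):
    """Map each document row to its coordinates on the singular vectors."""
    return [[dot(row, v) for v in V] for row in A]

def predict_1nn(train_X, train_y, x):
    """Stand-in for K-Star: label of the closest training instance."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]

# Invented training data (not from the study).
docs = [
    "bridge deck replacement and seismic retrofit",
    "seismic retrofit of bridge columns",
    "asphalt overlay and pavement striping",
    "pavement overlay with asphalt concrete",
]
labels = ["high", "high", "low", "low"]  # hypothetical overrun classes

A, vocab = count_matrix(docs)
V = top_right_singular_vectors(A, k=2)
X = project(A, V)

query, _ = count_matrix(["seismic retrofit of bridge deck"], vocab)
prediction = predict_1nn(X, labels, project(query, V)[0])

# Second model variant: append the (log-scaled) low bid to the text features.
low_bids = [4.2e6, 3.1e6, 0.9e6, 1.2e6]  # invented dollar amounts
X_with_bid = [x + [math.log10(b)] for x, b in zip(X, low_bids)]
```

The last two statements mirror the second model only in outline: the numeric low bid becomes one more attribute alongside the SVD coordinates. How the bid was scaled or weighted in the study is not stated, so the log scaling here is purely a placeholder.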
