Prediction of Sewer Pipe Deterioration Using Random Forest Classification

Wastewater infrastructure systems deteriorate over time due to a combination of physical and chemical factors. Failure of this significant infrastructure could have important social, environmental, and economic impacts. Furthermore, determining the optimal timeline for inspecting sewer pipelines is a challenging task for utility managers and other authorities. Regular examination of entire sewer networks is not cost-effective because of limited time, the high cost of assessment technologies, and the large inventory of pipes. To overcome these obstacles, researchers have worked to improve infrastructure condition assessment methodologies so that sewer pipe systems can be maintained at the desired condition. Sewer condition prediction models provide a framework for forecasting the future condition of pipes so that inspection frequencies can be scheduled. The main goal of this study is to develop a predictive model for wastewater pipes using random forest classification. Such models can predict sewer pipe condition effectively and reduce uncertainty about the current condition of wastewater pipes. In a case study application for the City of LA, California, the developed random forest classification model was assessed on a stratified test set by its false negative rate, false positive rate, and area under the ROC curve, and achieved an area under the ROC curve of 0.81. An area under the ROC curve greater than 0.80 indicates that the developed model is an "excellent" choice for predicting the condition of individual pipes in a sewer network. The deterioration models can be used in industry to improve inspection timelines and maintenance planning.
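The three evaluation measures named in the abstract can be illustrated with a minimal sketch. The labels, scores, 0.5 decision threshold, and function names below are hypothetical toy values chosen for illustration only; they are not the study's data, code, or results. Here label 1 is assumed to mean a pipe in poor condition, and the scores stand in for the class probabilities a random forest classifier would output.

```python
# Toy illustration of false negative rate, false positive rate, and ROC AUC.
# All data are made up; 1 = pipe in poor condition, 0 = acceptable (assumed).

def confusion_rates(y_true, y_pred):
    """Return (false negative rate, false positive rate) for binary labels."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    return fn / positives, fp / negatives

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical test-set labels and predicted probabilities of poor condition.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.6, 0.7, 0.3, 0.2, 0.4, 0.1, 0.05]

threshold = 0.5  # assumed cut-off; the source does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

fnr, fpr = confusion_rates(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"FNR={fnr:.3f}  FPR={fpr:.3f}  AUC={auc:.3f}")
```

The false negative and false positive rates depend on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is why a single value such as 0.81 can be reported without stating a cut-off.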
