Development of a Water Quality Event Detection and Diagnosis Framework in Drinking Water Distribution Systems with Structured and Unstructured Data Integration

Various approaches have recently been developed to detect anomalous events (e.g., discoloration, contamination) in water distribution systems (WDSs) by analyzing data collected from smart meters (so-called structured data). Although some of these approaches have shown promising results, meters often fail to collect or transmit data (i.e., missing data), so the methods may frequently fail to identify anomalies. A natural next step is therefore to combine structured data with unstructured data, which has no predefined format (e.g., textual content, images, and colors) and is often shared through social media platforms. However, no previous work has been carried out in this regard. This study proposes a framework that combines structured and unstructured data to identify WDS water quality events, using turbidity measurements as the structured data and text uploaded to social networking services (SNSs) as the unstructured data. In the proposed framework, water quality events are identified by applying data-driven detection tools to the structured data and cosine similarity to the unstructured data. The results indicate that the structured data-driven tools successfully detect large-magnitude events but fail to detect small ones. When the proposed framework is used, those previously undetected events are successfully identified. Thus, combining structured and unstructured data is necessary to maximize WDS water quality event detection.
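
As an illustration of the cosine similarity step for unstructured data, the following is a minimal sketch, not the study's implementation. It assumes a simple bag-of-words term-count representation; the reference phrase, the stop-word list, the 0.3 threshold, and the function names (`tokenize`, `cosine_similarity`, `flag_posts`) are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical reference phrase describing a discoloration event (assumed for illustration).
REFERENCE_TEXT = "tap water is brown and cloudy with rusty discoloration"

# Small illustrative stop-word list (assumed; a real list would be much longer).
STOP_WORDS = {"the", "is", "and", "with", "a", "in", "my", "of", "to", "this", "at", "on"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists using raw term counts."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as having no similarity
    return dot / (norm_a * norm_b)


def flag_posts(posts, threshold=0.3):
    """Return (post, score) pairs whose similarity to the reference meets the threshold."""
    reference = tokenize(REFERENCE_TEXT)
    flagged = []
    for post in posts:
        score = cosine_similarity(tokenize(post), reference)
        if score >= threshold:
            flagged.append((post, score))
    return flagged


if __name__ == "__main__":
    sample_posts = [
        "Brown cloudy water from my tap this morning",
        "Great weather at the park today",
    ]
    for post, score in flag_posts(sample_posts):
        print(f"{score:.2f}  {post}")
```

In this sketch, a post is flagged as a possible water quality event when its wording overlaps strongly with the reference phrase. In practice the reference vocabulary and threshold would need to be tuned on real SNS posts, and a weighted representation such as tf-idf could replace the raw counts.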
