Exploring the Effectiveness of Clustering Algorithms for Capturing Water Consumption Behavior at Household Level

As water scarcity becomes more prevalent, analyzing urban water consumption patterns at the consumer level and estimating the corresponding demand are expected to become top priorities for water utilities in the near future. This study proposes a comprehensive methodology that helps water managers operate urban water networks efficiently by detecting residential water consumption patterns that correspond to different household needs and behaviors. The methodology uses Self-Organizing Maps (SOM) as the main clustering algorithm, in combination with K-means and Hierarchical Agglomerative Clustering. The objective is to form clusters in a dataset from the literature that contains the water consumption of 21 customers located in Milford, Ohio, USA, over a 7-month period. The original data record every water use event in each household; for this analysis, they are aggregated to half-hourly consumption. Customers with similar consumption behavior are clustered, and a water consumption curve is calculated for each cluster; the water utility can use these curves to estimate the spatio-temporal distribution of demand and thus gain insight into peak demands at different locations. Statistical indices of agreement confirm good agreement between estimated and observed water use when clustering is employed. The resulting curves capture household-level water consumption behavior clearly better than the corresponding curves obtained without clustering. The analysis offers water utilities an innovative solution that relies on real-time data and data science principles to optimize water supply and network operation, and it provides tools for the efficient use of water resources.
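The evaluation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the cluster labels are supplied by hand as a stand-in for the output of the SOM, K-means, or hierarchical step, and Willmott's index of agreement d is assumed as one possible statistic, since the abstract does not say which indices were used. All function and variable names are hypothetical.

```python
from datetime import datetime

SLOTS_PER_DAY = 48  # half-hourly resolution


def half_hourly_profile(events):
    """Sum (timestamp, litres) water use events into 48 half-hour slots."""
    slots = [0.0] * SLOTS_PER_DAY
    for timestamp, litres in events:
        slot = timestamp.hour * 2 + (1 if timestamp.minute >= 30 else 0)
        slots[slot] += litres
    return slots


def mean_curve(profiles):
    """Slot-by-slot average of several half-hourly profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def index_of_agreement(observed, predicted):
    """Willmott's d = 1 - sum((O-P)^2) / sum((|P-mean(O)| + |O-mean(O)|)^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 if den == 0 else 1.0 - num / den


def single_use(hour, minute, litres):
    """One synthetic water use event on an arbitrary day."""
    return [(datetime(2020, 1, 1, hour, minute), litres)]


# Synthetic households: A and B use water in the morning, C and D in the evening.
households = {
    "A": half_hourly_profile(single_use(7, 10, 10.0)),
    "B": half_hourly_profile(single_use(7, 20, 12.0)),
    "C": half_hourly_profile(single_use(19, 5, 10.0)),
    "D": half_hourly_profile(single_use(19, 15, 8.0)),
}
# Assumed labels; in practice these would come from the clustering step.
labels = {"A": 0, "B": 0, "C": 1, "D": 1}

global_curve = mean_curve(list(households.values()))
cluster_curves = {
    c: mean_curve([households[h] for h in households if labels[h] == c])
    for c in set(labels.values())
}

d_cluster = index_of_agreement(households["A"], cluster_curves[labels["A"]])
d_global = index_of_agreement(households["A"], global_curve)
print(f"d with cluster curve: {d_cluster:.3f}, d with single curve: {d_global:.3f}")
```

In this toy setup the cluster curve follows household A's morning peak, while the single average curve spreads demand over both peaks, so the cluster curve scores a higher d. Real data would need the same comparison across all households and days.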
