Water quality change detection: multivariate algorithms

In light of growing concern over the safety and security of our nation's drinking water, increased attention has been focused on advanced monitoring of water distribution systems. The key to these advanced monitoring systems lies in the combination of real time data and robust statistical analysis. Currently available data streams from sensors provide near real time information on water quality. Combining these data streams with change detection algorithms, this project aims to develop automated monitoring techniques that will classify real time data and denote anomalous water types. Here, water quality data in 1 hour increments over 3000 hours at 4 locations are used to test multivariate algorithms to detect anomalous water quality events. The algorithms use all available water quality sensors to measure deviation from expected water quality. Simulated anomalous water quality events are added to the measured data to test three approaches to measure this deviation. These approaches include multivariate distance measures to 1) the previous observation, 2) the closest observation in multivariate space, and 3) the closest cluster of previous water quality observations. Clusters are established using kmeans classification. Each approach uses a moving window of previous water quality measurements to classify the current measurement as normal or anomalous. Receiver Operating Characteristic (ROC) curves test the ability of each approach to discriminate between normal and anomalous water quality using a variety of thresholds and simulated anomalous events. These analyses result in a better understanding of the deviation from normal water quality that is necessary to sound an alarm.