A flexible smoother adapted to censored data with outliers and its application to SARS-CoV-2 monitoring in wastewater

A sentinel network, Obépine, has been designed to monitor the SARS-CoV-2 viral load in wastewater arriving at wastewater treatment plants (WWTPs) in France as an indirect macroepidemiological parameter. The sources of uncertainty in such a monitoring system are numerous, and the concentration measurements it provides are left-censored and contain outliers, which biases the results of the usual smoothing methods. Hence the need for adapted pre-processing to evaluate the real daily amount of virus arriving at each WWTP. We propose a method based on an auto-regressive model adapted to censored data with outliers. Inference and prediction are produced via a discretised smoother, which makes it a very flexible tool. This method is validated both on simulations and on real data from Obépine. The resulting smoothed signal shows a good correlation with other epidemiological indicators and is currently used by Obépine to provide an estimate of virus circulation over the watersheds corresponding to about 200 WWTPs.
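
To illustrate the kind of discretised smoother the abstract describes, here is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes a latent AR(1) signal, a Gaussian measurement error, outliers drawn uniformly over a plausible measurement range with a fixed probability, and left-censoring below a known limit of detection. The function name `grid_smoother` and all parameter names (`phi`, `mu`, `sigma`, `tau`, `p_out`, `lod`, `y_range`) are hypothetical. The latent state is restricted to a finite grid, so the smoothing distribution can be computed with a standard forward-backward recursion.

```python
import math
import numpy as np

def norm_pdf(z):
    """Standard normal density, vectorised over NumPy arrays."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF, vectorised via math.erf."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def grid_smoother(y, censored, grid, phi, mu, sigma, tau, p_out, lod, y_range):
    """Forward-backward smoother on a discretised state space.

    Assumed (illustrative) model, not taken from the paper:
      X_t = mu + phi * (X_{t-1} - mu) + N(0, sigma^2)      latent signal
      Y_t = X_t + N(0, tau^2)   with probability 1 - p_out
      Y_t ~ Uniform(y_range)    with probability p_out      (outlier)
      Y_t is only known to be below `lod` when censored[t] is True.
    Returns the posterior mean per time step and the full posterior on the grid.
    """
    y = np.asarray(y, dtype=float)
    n, m = len(y), len(grid)
    lo, hi = y_range

    # Transition matrix: trans[i, j] approximates P(X_t = grid[j] | X_{t-1} = grid[i]).
    mean = mu + phi * (grid - mu)
    trans = norm_pdf((grid[None, :] - mean[:, None]) / sigma)
    trans /= trans.sum(axis=1, keepdims=True)

    # Emission likelihood of each observation for every grid value.
    emis = np.ones((n, m))  # rows stay flat for missing (NaN) observations
    for t in range(n):
        if censored[t]:
            # Probability of falling below the detection limit.
            emis[t] = ((1 - p_out) * norm_cdf((lod - grid) / tau)
                       + p_out * (lod - lo) / (hi - lo))
        elif not np.isnan(y[t]):
            # Mixture of a Gaussian measurement and a flat outlier density.
            emis[t] = ((1 - p_out) * norm_pdf((y[t] - grid) / tau) / tau
                       + p_out / (hi - lo))

    # Forward pass (filtering), with a flat prior on the grid at t = 0.
    alpha = np.zeros((n, m))
    alpha[0] = emis[0] / emis[0].sum()
    for t in range(1, n):
        a = (alpha[t - 1] @ trans) * emis[t]
        alpha[t] = a / a.sum()

    # Backward pass.
    beta = np.ones((n, m))
    for t in range(n - 2, -1, -1):
        b = trans @ (emis[t + 1] * beta[t + 1])
        beta[t] = b / b.sum()

    # Smoothing distribution and its mean.
    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid, post
```

Because the outlier component keeps every observation's likelihood bounded away from zero, a single aberrant measurement cannot drag the smoothed signal far from its neighbours, and censored observations contribute through a cumulative probability rather than a point value. The grid resolution trades accuracy against cost, which grows quadratically with the number of grid points.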
