Unsupervised Anomaly Detection in Multivariate Spatio-Temporal Data Using Deep Learning: Early Detection of COVID-19 Outbreak in Italy

Unsupervised anomaly detection for spatio-temporal data is used in a wide variety of applications, such as earth science, traffic monitoring, fraud detection, and disease outbreak detection. Most real-world time series data have a spatial dimension as additional context, often expressed as the coordinates of the region of interest (such as latitude and longitude). However, existing techniques are limited in their ability to handle spatial and temporal contextual attributes in an integrated and meaningful way that accounts for both the spatial and the temporal dependencies between observations. In this paper, a hybrid deep learning framework is proposed to solve the unsupervised anomaly detection problem in multivariate spatio-temporal data. The proposed framework works with unlabeled data, and no prior knowledge about anomalies is assumed. As a case study, we use the public COVID-19 data provided by the Italian Department of Civil Protection. COVID-19 data from the northern Italian regions are used to train the framework, and abnormal trends or upswings in the COVID-19 data of the central and southern Italian regions are then detected. The proposed framework detects early signals of the COVID-19 outbreak in the test regions based on the reconstruction error. For performance comparison, we perform a detailed evaluation of 15 algorithms on the COVID-19 Italy dataset, including state-of-the-art deep learning architectures. Experimental results show that our framework significantly improves unsupervised anomaly detection performance, even in data-scarce and high-contamination scenarios (where the ratio of anomalies in the data set is more than 5%). It achieves the earliest detection of the COVID-19 outbreak and performs better at tracking the peaks of the COVID-19 pandemic in the test regions. As the timeliness of detection is important in the fight against any outbreak, our framework provides useful insight for suppressing the resurgence of local novel coronavirus outbreaks as early as possible.
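
The idea of flagging anomalies by reconstruction error can be illustrated with a minimal sketch. It is not the hybrid deep learning framework proposed in the paper: a PCA-style linear autoencoder stands in for the deep model, the data are synthetic rather than the COVID-19 Italy dataset, and the training set is assumed to be free of anomalies. The function names, the 99% quantile threshold, and the synthetic generator are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(train, n_components):
    """Fit a PCA-style linear encoder/decoder (a stand-in for a deep model)."""
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(x, mean, components):
    """Squared error between each row and its encode-then-decode reconstruction."""
    centered = x - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)

rng = np.random.default_rng(0)
# Hypothetical mixing matrix: maps 2 latent factors to 4 observed features.
mixing = np.array([[1.0, 0.5, -0.3, 0.8], [0.2, -1.0, 0.7, 0.1]])

def sample_normal(n):
    # Synthetic "normal" observations near a 2-D subspace of a 4-D feature space.
    return rng.normal(size=(n, 2)) @ mixing + 0.05 * rng.normal(size=(n, 4))

train = sample_normal(500)
test_normal = sample_normal(50)
test_anomalous = sample_normal(50) + 1.0  # constant shift that leaves the subspace

mean, components = fit_linear_autoencoder(train, n_components=2)
# Assumed decision rule: flag anything above the 99th percentile of training error.
threshold = np.quantile(reconstruction_error(train, mean, components), 0.99)

flags_normal = reconstruction_error(test_normal, mean, components) > threshold
flags_anomalous = reconstruction_error(test_anomalous, mean, components) > threshold
print(flags_normal.mean(), flags_anomalous.mean())
```

The model is fitted only to data assumed to be normal, so observations that depart from the learned structure reconstruct poorly and exceed the threshold. In a setting like the one described above, the training rows would come from the training regions and the scored rows from the test regions.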
