Towards Outlier Sensor Detection in Ambient Intelligent Platforms: A Low-Complexity Statistical Approach

Sensor networks in real-world environments, such as smart cities or ambient intelligent platforms, provide applications with large and heterogeneous sets of data streams. Detecting outliers, that is, observations that do not conform to expected behavior, has therefore become a crucial task for establishing and maintaining secure and reliable databases on this kind of platform. However, the procedures used to obtain accurate models of erratic observations must operate with low complexity in terms of storage and computational time, in order to accommodate the limited processing and storage capabilities of the sensor nodes in these environments. In this work, we analyze three binary classifiers based on three statistical prediction models, ARIMA (Auto-Regressive Integrated Moving Average), GAM (Generalized Additive Model), and LOESS (LOcal RegrESSion), for outlier detection with low memory consumption and low computational time. As a result, we provide (1) the best classifier and settings to detect outliers, based on the ARIMA model, and (2) two real-world classified datasets as ground truths for future research.
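The general idea of a binary classifier built on a prediction model can be illustrated with a minimal sketch. It is not the classifier evaluated in this work: a trailing moving average stands in for the ARIMA, GAM, or LOESS predictor, and the function name `classify_outliers`, the parameters `window` and `k`, and the sample readings are all assumptions introduced for illustration. An observation is labeled an outlier when it deviates from the prediction by more than `k` standard deviations of the recent accepted values.

```python
from statistics import mean, stdev


def classify_outliers(series, window=5, k=3.0, min_scale=1e-9):
    """Return one boolean per observation: True = outlier, False = normal.

    Hypothetical stand-in predictor: the mean of the last `window`
    accepted (non-outlier) values. The first `window` observations are
    treated as a warm-up period and are assumed to be clean.
    """
    labels = [False] * len(series)
    clean = list(series[:window])  # history of accepted values only
    for i in range(window, len(series)):
        recent = clean[-window:]
        predicted = mean(recent)
        # Guard against a zero spread when recent values are identical.
        scale = max(stdev(recent), min_scale)
        if abs(series[i] - predicted) > k * scale:
            labels[i] = True  # flagged values are kept out of the history
        else:
            clean.append(series[i])
    return labels


# Illustrative temperature-like readings with one injected spike.
readings = [20.0, 20.1, 19.9, 20.2, 20.0, 20.1, 35.0, 19.8, 20.0, 20.1]
labels = classify_outliers(readings)
print([i for i, flagged in enumerate(labels) if flagged])
```

Keeping only a short window of accepted values in memory reflects the low storage and computation budget discussed above; a real deployment would replace the moving average with the chosen prediction model and tune `window` and `k` on labeled data.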
