RAD: On-line Anomaly Detection for Highly Unreliable Data

Classification algorithms have been widely adopted to detect anomalies in various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data collected from the wild can be unreliable due to careless annotation or malicious data transformation intended to mislead anomaly detection. In this paper, we present a two-layer on-line learning framework for robust anomaly detection (RAD) in the presence of unreliable anomaly labels, where the first layer filters out suspicious data and the second layer detects anomaly patterns from the remaining data. To adapt to the on-line nature of anomaly detection, we extend RAD with three additional features: repeated cleaning, conflicting opinions of classifiers, and oracle knowledge. We learn on-line from the incoming data streams and continuously cleanse the data, so as to exploit the learning capacity that grows with the accumulated data set. Moreover, we explore the concept of oracle learning, which provides the true labels for difficult data points. We focus on three use cases: (i) detecting 10 classes of IoT attacks, (ii) predicting 4 classes of task failures of big data jobs, and (iii) recognising the faces of 20 celebrities. Our evaluation results show that RAD robustly improves the accuracy of anomaly detection, reaching up to 98% for IoT device attacks (i.e., +11%), up to 84% for cloud task failures (i.e., +20%) under 40% noise, and up to 74% for face recognition (i.e., +28%) under 30% noisy labels. The proposed RAD is general and can be applied to different anomaly detection algorithms.
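The two-layer idea can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the small trusted initial batch, the synthetic two-class data, and all names (`NearestCentroid`, `two_layer_online`, `make_batch`) are assumptions made for illustration. Layer 1 keeps an incoming sample only when its given label agrees with the label-quality model's prediction; layer 2 is trained on the accumulated kept data. The repeated-cleaning, conflicting-opinion, and oracle extensions are not modelled here.

```python
import random
from collections import defaultdict


class NearestCentroid:
    """Tiny stand-in classifier: predicts the label of the closest class mean."""

    def __init__(self):
        self.centroids = {}

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(x)
            sums[label] = [s + v for s, v in zip(sums[label], x)]
            counts[label] += 1
        self.centroids = {k: [s / counts[k] for s in v] for k, v in sums.items()}
        return self

    def predict(self, X):
        def closest(x):
            return min(
                self.centroids,
                key=lambda k: sum((a - b) ** 2 for a, b in zip(x, self.centroids[k])),
            )
        return [closest(x) for x in X]


def two_layer_online(initial_X, initial_y, batches):
    """Layer 1 filters each incoming batch; layer 2 is fit on all kept data."""
    clean_X, clean_y = list(initial_X), list(initial_y)
    quality_model = NearestCentroid().fit(clean_X, clean_y)  # layer 1
    for X, noisy_y in batches:
        predicted = quality_model.predict(X)
        for x, given, pred in zip(X, noisy_y, predicted):
            if given == pred:  # label looks plausible, keep the sample
                clean_X.append(x)
                clean_y.append(given)
        # Retrain layer 1 on the larger accumulated set before the next batch.
        quality_model = NearestCentroid().fit(clean_X, clean_y)
    detector = NearestCentroid().fit(clean_X, clean_y)  # layer 2
    return detector, len(clean_X)


def make_batch(rng, n, noise_rate):
    """Synthetic data: class 0 near (0, 0), class 1 near (5, 5), labels flipped at noise_rate."""
    X, noisy_y = [], []
    for i in range(n):
        label = i % 2  # alternate classes so both are always present
        centre = 5.0 * label
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        noisy_y.append(1 - label if rng.random() < noise_rate else label)
    return X, noisy_y


rng = random.Random(0)
init_X, init_y = make_batch(rng, 20, noise_rate=0.0)  # assumed trusted start
stream = [make_batch(rng, 50, noise_rate=0.3) for _ in range(4)]
detector, kept = two_layer_online(init_X, init_y, stream)
print("samples kept:", kept, "of", 20 + 4 * 50)
print("prediction for [5.0, 5.0]:", detector.predict([[5.0, 5.0]])[0])
```

In this sketch the filter is deliberately simple: it drops every sample on which the label-quality model disagrees, so some correctly labelled but hard samples are lost as well, which is the kind of case the oracle extension described above is meant to address.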
