Compromising PCA-based Anomaly Detectors for Network-Wide Traffic

Acknowledgement

We would like to thank Marco Barreno for his valuable suggestions.

Abstract

Machine learning techniques are increasingly used to improve network design. When these techniques are applied to security problems, a fundamental problem arises: they are susceptible to adversaries who poison the learning phase. In this paper we focus on PCA-based anomaly detectors, which identify anomalies in backbone networks from a comprehensive view of the network's traffic. We present four data poisoning schemes and evaluate how effectively each increases an attacker's chance of evading detection. Because machine learning techniques often require retraining when the underlying data evolves, attackers can also employ stealthy poisoning methods that perturb the PCA detector slowly and covertly over time. We demonstrate that some of these poisoning schemes can increase the adversary's chance of success sixfold under relatively moderate attacks, and we comment on possible directions for combating these types of attacks.