Detecting Cyber Attacks Using Anomaly Detection with Explanations and Expert Feedback

Detecting cyber attacks in large computer networks is crucial for many organizations. To that end, different types of detectors capture signals on individual computers that resemble a security attack and bring them to the attention of a security analyst. Unfortunately, the analyst sometimes has no indication of why a particular computer was identified as being "under attack". In addition, the analyst may have no way to give feedback to the detector when a computer was flagged for some benign reason. In this paper, we use a state-of-the-art anomaly detector called an Isolation Forest [2] for attack detection and generate explanations of why the detector identified certain computers as anomalous. These explanations let the analyst direct the investigation and so save time. We then take the analyst's feedback, in the form of true and false positives, and update the anomaly detector to capture signals that align better with that feedback. Our experiments on actual network data show that the explanations give more insight into the detections and that the analyst's feedback increases the attack detection rate.
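As a rough illustration of the detection step only, the sketch below scores points with a small Isolation Forest written in plain Python. It is not the authors' implementation: the function names, the synthetic two-feature "host" data, and the parameter values are all assumptions made for the example, and the explanation and feedback steps described above are not shown. It follows the usual Isolation Forest recipe of random axis-aligned splits on subsamples, with the score 2^(-E[h(x)] / c(psi)), where h is the path length and psi the subsample size.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def avg_path_length(n):
    """Average path length c(n) of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # feature is constant here, so no split is possible
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Depth at which x lands, plus c(size) for the unsplit leaf."""
    depth = 0
    while "size" not in tree:
        tree = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
        depth += 1
    return depth + avg_path_length(tree["size"])


def build_forest(data, n_trees, sample_size, rng):
    """Grow n_trees trees, each on a random subsample (assumes len(data) >= 2)."""
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    return {"trees": trees, "psi": psi}


def anomaly_score(forest, x):
    """Score in (0, 1]; values closer to 1 mean x is isolated more quickly."""
    mean_depth = sum(path_length(t, x) for t in forest["trees"]) / len(forest["trees"])
    return 2.0 ** (-mean_depth / avg_path_length(forest["psi"]))


# Hypothetical data: 200 "normal" hosts plus one host far from the rest.
rng = random.Random(0)
hosts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
hosts.append((8.0, 8.0))
forest = build_forest(hosts, n_trees=100, sample_size=128, rng=rng)
outlier_score = anomaly_score(forest, (8.0, 8.0))
typical_score = anomaly_score(forest, (0.0, 0.0))
print(outlier_score, typical_score)
```

In this toy setting the distant host should receive the higher score, because random splits tend to separate it from the rest after only a few cuts. In practice a library implementation would normally be used in place of this hand-written version.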

[1] Thomas G. Dietterich, et al. Systematic construction of anomaly detection benchmarks from real data, 2013, ODD '13.

[2] Zhi-Hua Zhou, et al. Isolation Forest, 2008, 2008 Eighth IEEE International Conference on Data Mining.

[3] James Bailey, et al. Mining outlying aspects on numeric data, 2015, Data Mining and Knowledge Discovery.

[4] Erhan Guven, et al. A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection, 2016, IEEE Communications Surveys & Tutorials.

[5] Debin Gao, et al. Gray-box extraction of execution graphs for anomaly detection, 2004, CCS '04.

[6] Lalu Banoth, et al. A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection, 2017.

[7] Varun Chandola, et al. Anomaly detection: A survey, 2009, CSUR.

[8] Thomas G. Dietterich, et al. Sequential Feature Explanations for Anomaly Detection, 2019, ACM Trans. Knowl. Discov. Data.

[9] R. Sekar, et al. A fast automaton-based method for detecting anomalous program behaviors, 2001, Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001.

[10] Thomas G. Dietterich, et al. Anomaly detection in the presence of missing values for weather data quality control, 2018, COMPASS.

[11] Marius Kloft, et al. Toward Supervised Anomaly Detection, 2014, J. Artif. Intell. Res.

[12] James Bailey, et al. Scalable Outlying-Inlying Aspects Discovery via Feature Ranking, 2015, PAKDD.

[13] Ira Assent, et al. Explaining Outliers by Subspace Separability, 2013, 2013 IEEE 13th International Conference on Data Mining.

[14] Thomas G. Dietterich, et al. Incorporating Feedback into Tree-based Anomaly Detection, 2017, ArXiv.

[15] Tomás Pevný, et al. Learning combination of anomaly detectors for security domain, 2016, Comput. Networks.

[16] Thomas G. Dietterich, et al. Feedback-Guided Anomaly Discovery via Online Optimization, 2018, KDD.

[17] Sumeet Dua, et al. Data Mining and Machine Learning in Cybersecurity, 2011.

[18] Naren Ramakrishnan, et al. Unearthing Stealthy Program Attacks Buried in Extremely Long Execution Paths, 2015, CCS.

[19] Kalyan Veeramachaneni, et al. AI^2: Training a Big Data Machine to Defend, 2016, 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS).

[20] Thomas G. Dietterich, et al. Incorporating Expert Feedback into Active Anomaly Discovery, 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).